Coding Agents

Run Claude Code, Claude Agent SDK, OpenAI Codex CLI, and other AI coding agents with hardware-level isolation.

The Problem: AI Agents Execute Untrusted Code

Modern coding agents like Claude Code, Claude Agent SDK, and OpenAI Codex CLI are powerful because they can:

Execute arbitrary shell commands
Read and write files anywhere on your filesystem
Install packages and dependencies
Make network requests
Access environment variables and credentials

This power comes with serious risk. The code these agents generate and execute is inherently untrusted: it is produced by an LLM that may be following manipulated instructions.

Real-World Vulnerabilities

In 2025, security researchers discovered 30+ vulnerabilities in AI coding tools, including:

CVE-2025-61260 (Codex CLI): Command injection via MCP server config
CVE-2025-54794 (Claude Code): Path restriction bypass enabling arbitrary command execution
CVE-2025-64660 (GitHub Copilot): Prompt injection leading to code execution

These are not theoretical: they enable data exfiltration, credential theft, and full system compromise.

Why Hardware Isolation Matters

Most "sandboxing" solutions use containers, which share the host kernel:

┌─────────────────────────────────────────────────────────────┐
│                    Container "Sandbox"                       │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │                   AI Coding Agent                        │ │
│  │    Executes arbitrary code from LLM instructions        │ │
│  └──────────────────────────┬──────────────────────────────┘ │
│                             │ Kernel vulnerability?          │
│                             ▼                                │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │                   SHARED HOST KERNEL                     │ │
│  │           Full access to host system                     │ │
│  └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘

Container escape vulnerabilities such as CVE-2025-31133 (runC) affect Docker, Kubernetes, and containerd. Because an AI agent already executes arbitrary code inside the container, anyone who can steer the agent can attempt these exploits without needing any further foothold.

Akira uses hardware-level VM isolation:

┌─────────────────────────────────────────────────────────────┐
│                    Akira Micro-VM                            │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │                   AI Coding Agent                        │ │
│  │    Executes arbitrary code from LLM instructions        │ │
│  └──────────────────────────┬──────────────────────────────┘ │
│                             │                                │
│  ┌─────────────────────────────────────────────────────────┐ │
│  │               ISOLATED GUEST KERNEL                      │ │
│  │           Minimal attack surface                         │ │
│  └──────────────────────────┬──────────────────────────────┘ │
│                             │ Hardware virtualization        │
└─────────────────────────────┼───────────────────────────────┘
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                    HOST SYSTEM (Protected)                   │
│           Your credentials, files, and data are safe        │
└─────────────────────────────────────────────────────────────┘

Even if an agent is tricked into running malicious code, it cannot escape the VM.

Running Claude Code on Akira

Claude Code is Anthropic's agentic coding tool that lives in your terminal. It can understand your entire codebase, execute commands, and iterate on complex tasks.

Why Run Claude Code in Akira?

Risk on Local Machine	Protected in Akira
Agent reads `~/.ssh/id_rsa`	SSH keys don't exist in sandbox
Agent exfiltrates `~/.aws/credentials`	No cloud credentials to steal
Agent runs `rm -rf /` (prompt injection)	Only sandbox filesystem affected
Agent installs malware via `curl | bash`	Malware isolated to disposable VM
Agent leaks secrets via DNS lookup	Network restrictions can block exfiltration

Example: Claude Code in Akira Sandbox

import SandboxSDK from '@akiralabs/sandbox-sdk';
 
const client = new SandboxSDK({ apiKey: process.env.AKIRA_API_KEY });
 
async function runClaudeCode(repoUrl: string, task: string) {
  // Create isolated environment
  const sandbox = await client.sandboxes.create({
    image: 'akiralabs/akira-default-sandbox',
    resources: { cpus: 4, memory: 8192, storage: 50 },
    env_vars: {
      ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY,
      // Only pass credentials the agent actually needs
    },
  });
 
  try {
    // Install Claude Code
    await client.sandboxes.execute(sandbox.id, {
      command: 'npm install -g @anthropic-ai/claude-code',
      timeout: 120,
    });
 
    // Clone repository
    await client.sandboxes.execute(sandbox.id, {
      command: `git clone ${repoUrl} /app/repo`,
      timeout: 60,
    });
 
    // Run Claude Code on the task (-p is the non-interactive print mode)
    const result = await client.sandboxes.executeAsync(sandbox.id, {
      command: `cd /app/repo && claude -p "${task}"`,
    });
 
    // Stream output
    for await (const chunk of result) {
      if (chunk.stdout) process.stdout.write(chunk.stdout);
      if (chunk.stderr) process.stderr.write(chunk.stderr);
    }
 
    // Extract results
    const diff = await client.sandboxes.execute(sandbox.id, {
      command: 'cd /app/repo && git diff',
    });
 
    return diff.stdout;
  } finally {
    await client.sandboxes.delete(sandbox.id);
  }
}
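
The example above interpolates `repoUrl` and `task` directly into shell command strings, so a task containing quotes or shell metacharacters would break the command or inject extra ones. A minimal sketch of one way to guard against that is shown here. The helpers `shellQuote`, `buildCloneCommand`, and `buildClaudeCommand` are hypothetical names introduced for illustration and are not part of the Akira SDK; the sketch assumes the sandbox runs commands through a POSIX shell.

```typescript
// Hypothetical helper (not part of the Akira SDK): wrap a value in single
// quotes so a POSIX shell treats it as one literal argument.
function shellQuote(value: string): string {
  // Inside single quotes only the quote itself is special, so close the
  // string, emit an escaped quote, and reopen it:  '  ->  '\''
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Build the clone command with quoted arguments. The `--` stops git from
// interpreting a URL that starts with a dash as an option.
function buildCloneCommand(repoUrl: string, targetDir: string): string {
  return `git clone -- ${shellQuote(repoUrl)} ${shellQuote(targetDir)}`;
}

// Build the agent command; `claude -p` is Claude Code's non-interactive mode.
function buildClaudeCommand(repoDir: string, task: string): string {
  return `cd ${shellQuote(repoDir)} && claude -p ${shellQuote(task)}`;
}
```

The resulting strings can be passed as the `command` field in place of the raw template literals.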

Using the Claude Agent SDK with Akira

The Claude Agent SDK gives you programmatic access to Claude Code's capabilities for building custom agents. It includes file operations, bash execution, and context management.

Production Architecture

For production deployments, run Agent SDK workloads in Akira:

import SandboxSDK from '@akiralabs/sandbox-sdk';
 
const client = new SandboxSDK({ apiKey: process.env.AKIRA_API_KEY });
 
class IsolatedAgentRunner {
  private sandboxId: string | null = null;
 
  async initialize() {
    // Pre-warm a sandbox with dependencies
    const sandbox = await client.sandboxes.create({
      image: 'akiralabs/akira-default-sandbox',
      resources: { cpus: 4, memory: 8192, storage: 20 },
    });
    this.sandboxId = sandbox.id;
 
    // Install Agent SDK and dependencies
    await client.sandboxes.execute(this.sandboxId, {
      command: `
        npm init -y &&
        npm install @anthropic-ai/claude-agent-sdk typescript ts-node
      `,
      timeout: 120,
    });
 
    // Snapshot for fast restoration
    await client.sandboxes.snapshot(this.sandboxId, {
      name: 'agent-sdk-ready',
    });
  }
 
  async runTask(task: string, workspaceFiles: Record<string, string>) {
    // Restore fresh environment from snapshot
    const snapshots = await client.snapshots.list();
    const readySnapshot = snapshots.data.find(s => s.name === 'agent-sdk-ready');
    if (!readySnapshot) {
      throw new Error("Snapshot 'agent-sdk-ready' not found; call initialize() first");
    }
 
    const sandbox = await client.snapshots.restore(readySnapshot.id, {
      name: `task-${Date.now()}`,
    });
 
    try {
      // Upload workspace files
      for (const [path, content] of Object.entries(workspaceFiles)) {
        // Upload files to sandbox
        const formData = new FormData();
        formData.append('file', new Blob([content]), path);
        formData.append('path', `/workspace/${path}`);
        await client.sandboxes.upload(sandbox.id, formData);
      }
 
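      // Create agent script. The task is embedded with JSON.stringify so that
      // quotes or backticks in it cannot break out of the string literal.
      const agentScript = `
        import { query } from '@anthropic-ai/claude-agent-sdk';
 
        // query() yields a stream of messages; the final one has type 'result'
        for await (const message of query({
          prompt: ${JSON.stringify(task)},
          options: {
            cwd: '/workspace',
            // Permission rules; the Bash(...) entries limit which commands may run
            allowedTools: ['Read', 'Edit', 'Bash(npm:*)', 'Bash(node:*)', 'Bash(git:*)', 'Bash(cat:*)', 'Bash(ls:*)'],
          }
        })) {
          if (message.type === 'result') console.log(JSON.stringify(message));
        }
      `;
 
      // In a single-quoted shell string, a literal ' must be written as '\''
      await client.sandboxes.execute(sandbox.id, {
        command: `echo '${agentScript.replace(/'/g, "'\\''")}' > /workspace/agent.mjs`,
      });
 
      // Run the agent as an ES module (.mjs) so top-level await works
      const result = await client.sandboxes.execute(sandbox.id, {
        command: 'cd /workspace && node agent.mjs',
        timeout: 300,
        env_vars: {
          ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY,
        },
      });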
 
      return JSON.parse(result.stdout);
    } finally {
      await client.sandboxes.delete(sandbox.id);
    }
  }
}
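
`runTask` calls `JSON.parse` on the whole of `stdout`, which fails as soon as anything else is printed alongside the agent's result (an npm warning, for instance). One possible hardening, sketched below under the assumption that the agent script prints its result as a single-line JSON object, is to scan the output from the end for the last line that parses. `parseLastJsonLine` is a hypothetical helper, not part of either SDK.

```typescript
// Hypothetical helper: return the last line of stdout that is a JSON object.
function parseLastJsonLine(stdout: string): unknown {
  const lines = stdout
    .split('\n')
    .map(line => line.trim())
    .filter(line => line.length > 0);

  // Walk backwards so the most recent JSON object wins.
  for (let i = lines.length - 1; i >= 0; i--) {
    const line = lines[i];
    // Skip lines that cannot be a JSON object (bare numbers, log text, ...).
    if (line.charAt(0) !== '{') continue;
    try {
      return JSON.parse(line);
    } catch {
      // Looked like JSON but was not; keep scanning upward.
    }
  }
  throw new Error('No JSON object found in agent output');
}
```

With this in place, the final line of `runTask` could return `parseLastJsonLine(result.stdout)` instead of parsing the raw output.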

Running OpenAI Codex CLI on Akira

OpenAI Codex CLI is a lightweight coding agent that can read, change, and run code. It supports models such as GPT-5-Codex, which is optimized for software engineering.

Example: Codex CLI in Isolated Environment

import SandboxSDK from '@akiralabs/sandbox-sdk';
 
const client = new SandboxSDK({ apiKey: process.env.AKIRA_API_KEY });
 
async function runCodexTask(repoUrl: string, task: string) {
  const sandbox = await client.sandboxes.create({
    image: 'akiralabs/akira-default-sandbox',
    resources: { cpus: 2, memory: 4096 },
    env_vars: {
      OPENAI_API_KEY: process.env.OPENAI_API_KEY,
    },
  });
 
  try {
    // Install Codex CLI
    await client.sandboxes.execute(sandbox.id, {
      command: 'npm install -g @openai/codex',
      timeout: 60,
    });
 
    // Clone repository
    await client.sandboxes.execute(sandbox.id, {
      command: `git clone ${repoUrl} /app/repo`,
      timeout: 60,
    });
 
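    // Run Codex non-interactively with full autonomy (safe in sandbox).
    // `codex exec --full-auto` is the form used by current Codex CLI releases;
    // older releases used `codex --approval-mode full-auto`.
    const result = await client.sandboxes.execute(sandbox.id, {
      command: `cd /app/repo && codex exec --full-auto "${task}"`,
      timeout: 300,
    });
 
    // Collect the changes the agent made
    const diff = await client.sandboxes.execute(sandbox.id, {
      command: 'cd /app/repo && git diff',
    });
 
    return {
      output: result.stdout,
      changes: diff.stdout,
    };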
  } finally {
    await client.sandboxes.delete(sandbox.id);
  }
}

Full Autonomy, Zero Risk

Running Codex in full-auto mode on your local machine is dangerous because it executes commands without asking for approval. In an Akira sandbox, full autonomy is safe because the agent can only affect the isolated environment.

Security Architecture

Credential Isolation

Never pass unnecessary credentials to sandboxes:

// BAD: Exposing all credentials
const overexposedSandbox = await client.sandboxes.create({
  image: 'akiralabs/akira-default-sandbox',
  env_vars: {
    AWS_ACCESS_KEY_ID: process.env.AWS_ACCESS_KEY_ID,
    AWS_SECRET_ACCESS_KEY: process.env.AWS_SECRET_ACCESS_KEY,
    DATABASE_URL: process.env.DATABASE_URL,
    GITHUB_TOKEN: process.env.GITHUB_TOKEN,
    STRIPE_SECRET_KEY: process.env.STRIPE_SECRET_KEY,
    // Agent now has access to everything!
  },
});
 
// GOOD: Minimal credentials for the task
const sandbox = await client.sandboxes.create({
  image: 'akiralabs/akira-default-sandbox',
  env_vars: {
    ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY,
    // Only what the agent needs to call the LLM API
  },
});
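
One way to make the minimal-credentials rule hard to violate by accident is to build `env_vars` from an explicit allowlist rather than copying values by hand. The sketch below is illustrative only: `pickEnv` is a hypothetical helper, and the variable names in the usage comment are the ones from the example above.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical helper: copy only the allowlisted variables from `source`,
// failing loudly if one is missing so the agent never starts half-configured.
function pickEnv(source: Env, allowlist: readonly string[]): Record<string, string> {
  const picked: Record<string, string> = {};
  for (const name of allowlist) {
    const value = source[name];
    if (value === undefined || value === '') {
      throw new Error(`Missing required environment variable: ${name}`);
    }
    picked[name] = value;
  }
  return picked;
}

// Usage (assumed call shape, matching the example above):
//   env_vars: pickEnv(process.env, ['ANTHROPIC_API_KEY'])
```

Anything not named in the allowlist, such as `AWS_SECRET_ACCESS_KEY` or `DATABASE_URL`, never reaches the sandbox even if it is set on the host.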

Network Isolation

Sandboxes have outbound internet access by default, but it can be restricted:

// For sensitive tasks, use network-isolated sandboxes
// and proxy only necessary API calls through your backend
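
If you proxy API calls through your own backend as suggested above, the proxy needs to decide which destinations to forward. The following is a minimal sketch of such a check, not an Akira feature: `isAllowedEgress` and the `ALLOWED_HOSTS` list are hypothetical, and the two hostnames are only examples of LLM API endpoints an agent might need.

```typescript
// Illustrative allowlist; adjust to the APIs your agent actually needs.
const ALLOWED_HOSTS: readonly string[] = ['api.anthropic.com', 'api.openai.com'];

// Hypothetical check a backend proxy could run before forwarding a request.
function isAllowedEgress(
  rawUrl: string,
  allowedHosts: readonly string[] = ALLOWED_HOSTS,
): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // Not a parseable absolute URL
  }
  // Require TLS and an exact hostname match. Suffix or substring matching
  // would let "api.anthropic.com.evil.example" through.
  if (url.protocol !== 'https:') return false;
  const host = url.hostname.toLowerCase();
  return allowedHosts.some(allowed => allowed === host);
}
```

Exact-match comparison on the parsed hostname is a deliberate choice here: it is stricter than pattern matching and avoids lookalike domains.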

Ephemeral by Default

Each task gets a fresh sandbox, so no state leaks between tasks:

async function processTask(task: Task) {
  const sandbox = await client.sandboxes.create({ /* ... */ });
 
  try {
    // Task runs in complete isolation
    await runAgent(sandbox.id, task);
  } finally {
    // Sandbox destroyed - nothing persists
    await client.sandboxes.delete(sandbox.id);
  }
  // Next task gets fresh environment
}

Comparison: Local vs Akira

Aspect	Running Locally	Running on Akira
SSH Keys	Accessible at `~/.ssh`	Not present
Cloud Credentials	In `~/.aws`, `~/.gcloud`	Not present
Browser Cookies	Accessible	Not present
Host Filesystem	Full access	Isolated
Network	Unrestricted	Controlled
Malware Persistence	Can survive	Destroyed with sandbox
Prompt Injection Impact	System compromise	Contained to sandbox
Recovery from Failure	Manual cleanup	Delete and recreate

Production Best Practices

1. Snapshot Base Environments

Pre-configure and snapshot environments for fast startup:

// One-time setup
const base = await client.sandboxes.create({
  image: 'akiralabs/akira-default-sandbox',
  resources: { cpus: 4, memory: 8192 },
});
 
await client.sandboxes.execute(base.id, {
  command: `
    npm install -g @anthropic-ai/claude-code @openai/codex typescript &&
    apt-get update && apt-get install -y python3 python3-pip
  `,
  timeout: 300,
});
 
await client.sandboxes.snapshot(base.id, { name: 'coding-agent-base' });
 
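// Fast restoration for each task: look up the snapshot by name, then restore it
const snapshots = await client.snapshots.list();
const baseSnapshot = snapshots.data.find(s => s.name === 'coding-agent-base');
if (!baseSnapshot) throw new Error("Snapshot 'coding-agent-base' not found");

const taskSandbox = await client.snapshots.restore(baseSnapshot.id, {
  name: `task-${taskId}`,
});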

2. Implement Execution Timeouts

Prevent runaway agents:

const result = await client.sandboxes.execute(sandbox.id, {
  // -p runs Claude Code non-interactively (print mode)
  command: 'claude -p "refactor this codebase"',
  timeout: 600, // 10 minute max
});
 
if (result.status === 'timeout') {
  console.log('Agent exceeded time limit');
  await client.sandboxes.delete(sandbox.id);
}
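
When a task runs several commands in sequence, a per-command timeout alone does not bound the total runtime. A possible approach, sketched here with a hypothetical helper `stepTimeoutSeconds`, is to fix an overall deadline and give each step the smaller of a per-step cap and the time remaining. The sketch assumes `timeout` is expressed in seconds, as the 10-minute comment above suggests.

```typescript
// Hypothetical helper: timeout (in seconds) for the next step, given an
// overall deadline and a per-step cap. Times are epoch milliseconds.
function stepTimeoutSeconds(
  deadlineMs: number,
  nowMs: number,
  perStepCapSeconds: number,
): number {
  const remainingSeconds = Math.floor((deadlineMs - nowMs) / 1000);
  if (remainingSeconds <= 0) {
    throw new Error('Task time budget exhausted');
  }
  return Math.min(remainingSeconds, perStepCapSeconds);
}

// Usage sketch: compute the deadline once, then before every step call
//   timeout: stepTimeoutSeconds(deadline, Date.now(), 300)
```

Throwing once the budget is spent gives the caller a single place to delete the sandbox and report the failure.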

3. Monitor and Log

Track agent behavior:

const logs = await client.sandboxes.logs({
  sandbox_id: sandbox.id,
  limit: 100,
});
 
for (const entry of logs) {
  console.log(`[${entry.timestamp}] ${entry.command} -> exit ${entry.exit_code}`);
}
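
Printing every entry is a start, but in practice you usually want to surface the entries worth a human look. The sketch below assumes log entries have the `timestamp`, `command`, and `exit_code` fields used above; `flagEntries` and the pattern list are hypothetical and deliberately small, so treat them as a starting point rather than a detection rule set.

```typescript
interface LogEntry {
  timestamp: string;
  command: string;
  exit_code: number;
}

// Illustrative patterns only: piping a download into a shell, and reads of
// common credential directories.
const SUSPICIOUS_PATTERNS: readonly RegExp[] = [
  /curl[^|]*\|\s*(ba)?sh/,
  /\.ssh\//,
  /\.aws\//,
];

// Hypothetical helper: split out failed commands and commands that match a
// suspicious pattern. An entry may appear in both lists.
function flagEntries(entries: readonly LogEntry[]): {
  failed: LogEntry[];
  suspicious: LogEntry[];
} {
  const failed = entries.filter(entry => entry.exit_code !== 0);
  const suspicious = entries.filter(entry =>
    SUSPICIOUS_PATTERNS.some(pattern => pattern.test(entry.command)),
  );
  return { failed, suspicious };
}
```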

4. Use Snapshots for Checkpoints

Save progress on long-running tasks:

// After each successful step
await client.sandboxes.snapshot(sandbox.id, {
  name: `checkpoint-step-${stepNumber}`,
});
 
// If something goes wrong, restore last good state
const restored = await client.snapshots.restore(checkpointId, {
  name: 'recovery',
});
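
The recovery step needs to know which checkpoint is the last good one. Assuming snapshot listings expose `id` and `name` as in the earlier examples, and that checkpoints follow the `checkpoint-step-N` naming used above, a sketch of picking the most recent one could look like this. `latestCheckpoint` is a hypothetical helper.

```typescript
interface SnapshotInfo {
  id: string;
  name: string;
}

// Hypothetical helper: return the checkpoint with the highest step number,
// or undefined if no snapshot matches the checkpoint naming scheme.
function latestCheckpoint(
  snapshots: readonly SnapshotInfo[],
): SnapshotInfo | undefined {
  let best: SnapshotInfo | undefined;
  let bestStep = -1;
  for (const snapshot of snapshots) {
    const match = /^checkpoint-step-(\d+)$/.exec(snapshot.name);
    if (!match) continue; // Not a checkpoint (e.g. a base image snapshot)
    // Compare numerically so step 10 sorts after step 2.
    const step = parseInt(match[1], 10);
    if (step > bestStep) {
      bestStep = step;
      best = snapshot;
    }
  }
  return best;
}
```

The `id` of the returned snapshot would then be passed to the restore call in place of `checkpointId`.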

Industry Guidance

Gartner Prediction

"Through 2029, over 50% of successful cybersecurity attacks against AI agents will exploit access control issues, using direct or indirect prompt injection as an attack vector."

Gartner recommends companies "limit [AI coding] to a controlled, safe sandbox for execution."

Next Steps

Learn about autonomous agents
Explore parallel processing for batch tasks
Review the security model
