Coding Agents
Run Claude Code, Claude Agent SDK, OpenAI Codex CLI, and other AI coding agents with hardware-level isolation.
The Problem: AI Agents Execute Untrusted Code
Modern coding agents like Claude Code, Claude Agent SDK, and OpenAI Codex CLI are powerful because they can:
- Execute arbitrary shell commands
- Read and write files anywhere on your filesystem
- Install packages and dependencies
- Make network requests
- Access environment variables and credentials
This power comes with serious risk. The code these agents generate and execute is inherently untrusted - it's produced by an LLM following potentially manipulated instructions.
Real-World Vulnerabilities
In 2025, security researchers discovered 30+ vulnerabilities in AI coding tools, including:
- CVE-2025-61260 (Codex CLI): Command injection via MCP server config
- CVE-2025-54794 (Claude Code): Path restriction bypass enabling arbitrary command execution
- CVE-2025-64660 (GitHub Copilot): Prompt injection leading to code execution
These aren't theoretical - they enable data exfiltration, credential theft, and full system compromise.
Why Hardware Isolation Matters
Most "sandboxing" solutions use containers, which share the host kernel:
┌─────────────────────────────────────────────────────────────┐
│                     Container "Sandbox"                     │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │                     AI Coding Agent                     │ │
│ │      Executes arbitrary code from LLM instructions      │ │
│ └──────────────────────────┬──────────────────────────────┘ │
│                            │ Kernel vulnerability?          │
│                            ▼                                │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │                   SHARED HOST KERNEL                    │ │
│ │               Full access to host system                │ │
│ └─────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
Container escape vulnerabilities like CVE-2025-31133 (runc) affect Docker, Kubernetes, and containerd. AI agents executing arbitrary code make these exploits trivially triggerable.
Akira uses hardware-level VM isolation:
┌─────────────────────────────────────────────────────────────┐
│                       Akira Micro-VM                        │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │                     AI Coding Agent                     │ │
│ │      Executes arbitrary code from LLM instructions      │ │
│ └──────────────────────────┬──────────────────────────────┘ │
│                            │                                │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │                  ISOLATED GUEST KERNEL                  │ │
│ │                 Minimal attack surface                  │ │
│ └──────────────────────────┬──────────────────────────────┘ │
│                            │ Hardware virtualization        │
└─────────────────────────────┼───────────────────────────────┘
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                   HOST SYSTEM (Protected)                   │
│         Your credentials, files, and data are safe          │
└─────────────────────────────────────────────────────────────┘
Even if an agent is tricked into running malicious code, it cannot escape the VM.
Running Claude Code on Akira
Claude Code is Anthropic's agentic coding tool that lives in your terminal. It can understand your entire codebase, execute commands, and iterate on complex tasks.
Why Run Claude Code in Akira?
| Risk on Local Machine | Protected in Akira |
|---|---|
| Agent reads ~/.ssh/id_rsa | SSH keys don't exist in sandbox |
| Agent exfiltrates ~/.aws/credentials | No cloud credentials to steal |
| Agent runs rm -rf / (prompt injection) | Only sandbox filesystem affected |
| Agent installs malware via curl \| bash | Malware isolated to disposable VM |
| Agent leaks secrets via DNS lookup | Network-isolated sandboxes block exfiltration |
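The right-hand column holds only if host credentials never reach the sandbox in the first place. A minimal sketch of one way to enforce that in orchestration code, assuming you build the `env_vars` object yourself: `pickEnv` is a hypothetical helper (not part of the Akira SDK) that forwards only explicitly allowlisted variables and drops anything unset.

```typescript
// Hypothetical helper, not part of the Akira SDK: build the env_vars object
// for a sandbox from an explicit allowlist instead of copying process.env.
type EnvMap = Record<string, string | undefined>;

function pickEnv(source: EnvMap, allowlist: readonly string[]): Record<string, string> {
  const picked: Record<string, string> = {};
  for (const name of allowlist) {
    const value = source[name];
    // Skip unset or empty variables rather than forwarding "undefined"
    if (typeof value === 'string' && value.length > 0) {
      picked[name] = value;
    }
  }
  return picked;
}

// Illustrative host environment (placeholder values)
const hostEnv: EnvMap = {
  ANTHROPIC_API_KEY: 'sk-ant-example',
  AWS_SECRET_ACCESS_KEY: 'do-not-forward',
};

// Only the allowlisted key that is actually set gets forwarded
const sandboxEnv = pickEnv(hostEnv, ['ANTHROPIC_API_KEY', 'GITHUB_TOKEN']);
console.log(Object.keys(sandboxEnv));
```

The returned object could then be passed as `env_vars` when creating a sandbox, which also avoids forwarding `undefined` when a variable is missing on the host.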
Example: Claude Code in Akira Sandbox
import SandboxSDK from '@akiralabs/sandbox-sdk';

const client = new SandboxSDK({ apiKey: process.env.AKIRA_API_KEY });

async function runClaudeCode(repoUrl: string, task: string) {
  // Create isolated environment
  const sandbox = await client.sandboxes.create({
    image: 'akiralabs/akira-default-sandbox',
    resources: { cpus: 4, memory: 8192, storage: 50 },
    env_vars: {
      ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY,
      // Only pass credentials the agent actually needs
    },
  });
  try {
    // Install Claude Code
    await client.sandboxes.execute(sandbox.id, {
      command: 'npm install -g @anthropic-ai/claude-code',
      timeout: 120,
    });
    // Clone repository (repoUrl is interpolated as-is; shell-quote it if it is untrusted)
    await client.sandboxes.execute(sandbox.id, {
      command: `git clone ${repoUrl} /app/repo`,
      timeout: 60,
    });
    // Run Claude Code on the task; -p (print mode) runs it non-interactively
    const result = await client.sandboxes.executeAsync(sandbox.id, {
      command: `cd /app/repo && claude -p "${task}"`,
    });
    // Stream output
    for await (const chunk of result) {
      if (chunk.stdout) process.stdout.write(chunk.stdout);
      if (chunk.stderr) process.stderr.write(chunk.stderr);
    }
    // Extract results
    const diff = await client.sandboxes.execute(sandbox.id, {
      command: 'cd /app/repo && git diff',
    });
    return diff.stdout;
  } finally {
    // Always destroy the sandbox, even if a step above failed
    await client.sandboxes.delete(sandbox.id);
  }
}
Using the Claude Agent SDK with Akira
The Claude Agent SDK gives you programmatic access to Claude Code's capabilities for building custom agents. It includes file operations, bash execution, and context management.
Production Architecture
For production deployments, run Agent SDK workloads in Akira:
import SandboxSDK from '@akiralabs/sandbox-sdk';

const client = new SandboxSDK({ apiKey: process.env.AKIRA_API_KEY });

class IsolatedAgentRunner {
  private sandboxId: string | null = null;

  async initialize() {
    // Pre-warm a sandbox with dependencies
    const sandbox = await client.sandboxes.create({
      image: 'akiralabs/akira-default-sandbox',
      resources: { cpus: 4, memory: 8192, storage: 20 },
    });
    this.sandboxId = sandbox.id;
    // Install the Agent SDK and its dependencies inside the sandbox
    // (the host process only needs the Akira SDK)
    await client.sandboxes.execute(this.sandboxId, {
      command: `
        npm init -y &&
        npm install @anthropic-ai/claude-agent-sdk typescript ts-node
      `,
      timeout: 120,
    });
    // Snapshot for fast restoration
    await client.sandboxes.snapshot(this.sandboxId, {
      name: 'agent-sdk-ready',
    });
  }
  async runTask(task: string, workspaceFiles: Record<string, string>) {
    // Restore a fresh environment from the snapshot
    const snapshots = await client.snapshots.list();
    const readySnapshot = snapshots.data.find(s => s.name === 'agent-sdk-ready');
    if (!readySnapshot) {
      throw new Error('Snapshot "agent-sdk-ready" not found; call initialize() first');
    }
    const sandbox = await client.snapshots.restore(readySnapshot.id, {
      name: `task-${Date.now()}`,
    });
    try {
      // Upload workspace files
      for (const [path, content] of Object.entries(workspaceFiles)) {
        const formData = new FormData();
        formData.append('file', new Blob([content]), path);
        formData.append('path', `/workspace/${path}`);
        await client.sandboxes.upload(sandbox.id, formData);
      }
      // Create the agent script; JSON.stringify embeds the task as a safe string literal
      const agentScript = `
import { query } from '@anthropic-ai/claude-agent-sdk';

async function main() {
  let finalMessage: unknown = null;
  // query() yields messages as the agent works; keep the final 'result' message
  for await (const message of query({
    prompt: ${JSON.stringify(task)},
    options: {
      cwd: '/workspace',
      allowedTools: ['Read', 'Edit', 'Bash'],
    },
  })) {
    if (message.type === 'result') finalMessage = message;
  }
  console.log(JSON.stringify(finalMessage));
}

main();
`;
      // Single-quote the script for the shell; an embedded ' becomes '\''
      const quotedScript = "'" + agentScript.replace(/'/g, "'\\''") + "'";
      await client.sandboxes.execute(sandbox.id, {
        command: `printf '%s' ${quotedScript} > /workspace/agent.ts`,
      });
      // Run the agent
      const result = await client.sandboxes.execute(sandbox.id, {
        command: 'cd /workspace && npx ts-node agent.ts',
        timeout: 300,
        env_vars: {
          ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY,
        },
      });
      return JSON.parse(result.stdout);
    } finally {
      await client.sandboxes.delete(sandbox.id);
    }
  }
}
Running OpenAI Codex CLI on Akira
OpenAI Codex CLI is a lightweight coding agent that can read, change, and run code. It supports models like GPT-5-Codex optimized for software engineering.
Example: Codex CLI in Isolated Environment
import SandboxSDK from '@akiralabs/sandbox-sdk';

const client = new SandboxSDK({ apiKey: process.env.AKIRA_API_KEY });

async function runCodexTask(repoUrl: string, task: string) {
  const sandbox = await client.sandboxes.create({
    image: 'akiralabs/akira-default-sandbox',
    resources: { cpus: 2, memory: 4096 },
    env_vars: {
      OPENAI_API_KEY: process.env.OPENAI_API_KEY,
    },
  });
  try {
    // Install Codex CLI
    await client.sandboxes.execute(sandbox.id, {
      command: 'npm install -g @openai/codex',
      timeout: 60,
    });
    // Clone repository
    await client.sandboxes.execute(sandbox.id, {
      command: `git clone ${repoUrl} /app/repo`,
      timeout: 60,
    });
    // Run Codex with full autonomy (safe in sandbox)
    const result = await client.sandboxes.execute(sandbox.id, {
      command: `cd /app/repo && codex --approval-mode full-auto "${task}"`,
      timeout: 300,
    });
    // Collect the resulting changes as a diff string
    const diff = await client.sandboxes.execute(sandbox.id, {
      command: 'cd /app/repo && git diff',
    });
    return {
      output: result.stdout,
      changes: diff.stdout,
    };
  } finally {
    await client.sandboxes.delete(sandbox.id);
  }
}
Full Autonomy, Zero Risk
Running Codex with --approval-mode full-auto on your local machine is dangerous - it executes commands without asking. In an Akira sandbox, the blast radius of full autonomy is limited to the isolated environment and the credentials you chose to pass in.
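The Claude Code and Codex examples above interpolate `repoUrl` and `task` directly into shell command strings, so a task containing quotes or `$(...)` would be interpreted by the sandbox shell before the agent ever sees it. A minimal sketch of POSIX single-quote escaping, assuming the sandbox runs commands through a POSIX shell; `shellQuote` and `buildCommands` are hypothetical helpers, not part of any SDK.

```typescript
// Hypothetical helper: quote one argument for a POSIX shell.
// Everything inside single quotes is literal; an embedded ' is written as '\''.
function shellQuote(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Hypothetical helper: build the clone and agent commands from untrusted inputs.
function buildCommands(repoUrl: string, task: string): { clone: string; run: string } {
  return {
    // "--" stops git from treating a URL that starts with "-" as an option
    clone: `git clone -- ${shellQuote(repoUrl)} /app/repo`,
    run: `cd /app/repo && codex --approval-mode full-auto ${shellQuote(task)}`,
  };
}

// Illustrative inputs: the task mixes both quote styles and a command substitution
const commands = buildCommands(
  'https://github.com/example/repo.git',
  `rename "foo" to 'bar'; ignore $(whoami)`,
);
console.log(commands.clone);
console.log(commands.run);
```

The sandbox still contains whatever the agent does afterwards; quoting only ensures the task text reaches the agent verbatim instead of being partly executed by the shell.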
Security Architecture
Credential Isolation
Never pass unnecessary credentials to sandboxes:
// BAD: Exposing all credentials
const overexposed = await client.sandboxes.create({
  image: 'akiralabs/akira-default-sandbox',
  env_vars: {
    AWS_ACCESS_KEY_ID: process.env.AWS_ACCESS_KEY_ID,
    AWS_SECRET_ACCESS_KEY: process.env.AWS_SECRET_ACCESS_KEY,
    DATABASE_URL: process.env.DATABASE_URL,
    GITHUB_TOKEN: process.env.GITHUB_TOKEN,
    STRIPE_SECRET_KEY: process.env.STRIPE_SECRET_KEY,
    // Agent now has access to everything!
  },
});
// GOOD: Minimal credentials for the task
const sandbox = await client.sandboxes.create({
  image: 'akiralabs/akira-default-sandbox',
  env_vars: {
    ANTHROPIC_API_KEY: process.env.ANTHROPIC_API_KEY,
    // Only what the agent needs to call the LLM API
  },
});
Network Isolation
Sandboxes have outbound internet access by default, but it can be restricted:
// For sensitive tasks, use network-isolated sandboxes
// and proxy only necessary API calls through your backend
Ephemeral by Default
Each task gets a fresh sandbox - no state leakage between tasks:
async function processTask(task: Task) {
  const sandbox = await client.sandboxes.create({ /* ... */ });
  try {
    // Task runs in complete isolation
    await runAgent(sandbox.id, task);
  } finally {
    // Sandbox destroyed - nothing persists
    await client.sandboxes.delete(sandbox.id);
  }
  // Next task gets fresh environment
}
Comparison: Local vs Akira
| Aspect | Running Locally | Running on Akira |
|---|---|---|
| SSH Keys | Accessible at ~/.ssh | Not present |
| Cloud Credentials | In ~/.aws, ~/.config/gcloud | Not present |
| Browser Cookies | Accessible | Not present |
| Host Filesystem | Full access | Isolated |
| Network | Unrestricted | Controlled |
| Malware Persistence | Can survive | Destroyed with sandbox |
| Prompt Injection Impact | System compromise | Contained to sandbox |
| Recovery from Failure | Manual cleanup | Delete and recreate |
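The "Controlled" network row depends on how egress is actually restricted. One option mentioned under Network Isolation is to proxy only the necessary API calls through your backend. A minimal sketch of the allowlist check such a proxy might apply before forwarding a request: `isAllowedUpstream` and the host list are illustrative assumptions, not an Akira feature.

```typescript
// Hypothetical allowlist for a backend proxy that forwards sandbox traffic.
// The hosts listed here are examples; use whatever APIs your agents need.
const ALLOWED_HOSTS: ReadonlySet<string> = new Set([
  'api.anthropic.com',
  'api.openai.com',
]);

function isAllowedUpstream(rawUrl: string, allowedHosts: ReadonlySet<string>): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // not a parseable absolute URL
  }
  // Require HTTPS on the default port and an exact hostname match.
  // Exact matching rejects lookalikes such as api.anthropic.com.evil.example.
  return url.protocol === 'https:' && url.port === '' && allowedHosts.has(url.hostname);
}

console.log(isAllowedUpstream('https://api.anthropic.com/v1/messages', ALLOWED_HOSTS));
console.log(isAllowedUpstream('https://api.anthropic.com.evil.example/', ALLOWED_HOSTS));
```

Matching on the parsed hostname rather than on a string prefix is the important design choice: prefix or substring checks are easy to bypass with crafted URLs.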
Production Best Practices
1. Snapshot Base Environments
Pre-configure and snapshot environments for fast startup:
// One-time setup
const base = await client.sandboxes.create({
  image: 'akiralabs/akira-default-sandbox',
  resources: { cpus: 4, memory: 8192 },
});
await client.sandboxes.execute(base.id, {
  command: `
    npm install -g @anthropic-ai/claude-code @openai/codex typescript &&
    apt-get update && apt-get install -y python3 python3-pip
  `,
  timeout: 300,
});
await client.sandboxes.snapshot(base.id, { name: 'coding-agent-base' });
// Fast restoration for each task: look up the snapshot by name, then restore it
const snapshots = await client.snapshots.list();
const baseSnapshot = snapshots.data.find(s => s.name === 'coding-agent-base');
if (!baseSnapshot) throw new Error('Snapshot "coding-agent-base" not found');
const taskSandbox = await client.snapshots.restore(baseSnapshot.id, {
  name: `task-${taskId}`,
});
2. Implement Execution Timeouts
Prevent runaway agents:
const result = await client.sandboxes.execute(sandbox.id, {
  // -p (print mode) runs Claude Code non-interactively
  command: 'claude -p "refactor this codebase"',
  timeout: 600, // 10 minute max
});
if (result.status === 'timeout') {
  console.log('Agent exceeded time limit');
  await client.sandboxes.delete(sandbox.id);
}
3. Monitor and Log
Track agent behavior:
const logs = await client.sandboxes.logs({
  sandbox_id: sandbox.id,
  limit: 100,
});
for (const entry of logs) {
  console.log(`[${entry.timestamp}] ${entry.command} -> exit ${entry.exit_code}`);
}
4. Use Snapshots for Checkpoints
Save progress on long-running tasks:
// After each successful step
await client.sandboxes.snapshot(sandbox.id, {
  name: `checkpoint-step-${stepNumber}`,
});
// If something goes wrong, restore last good state
const restored = await client.snapshots.restore(checkpointId, {
  name: 'recovery',
});
Industry Guidance
Gartner Prediction
"Through 2029, over 50% of successful cybersecurity attacks against AI agents will exploit access control issues, using direct or indirect prompt injection as an attack vector."
Gartner recommends companies "limit [AI coding] to a controlled, safe sandbox for execution."
Next Steps
- Learn about autonomous agents
- Explore parallel processing for batch tasks
- Review the security model