Coming Soon

Give agents a scratchpad

Code Mode for AI agents. Process data locally, return only what matters.

  • 13MB binary¹
  • <10ms cold start²
  • Zero infrastructure
  • Full audit trail

¹WASM binary size. ²Measured on Ryzen 9 9900X, browser WASM instantiation.

See it in action

How agents use the sandbox

Agent Workflow
User: Summarize this 50MB sales CSV and tell me the top 3 products by revenue.
Agent → Sandbox: # Process 50MB locally
Agent → User: returns only the top 3 products by revenue.
Input: 50MB. Context consumed: 89 bytes.
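
A rough sketch of what that sandbox step could look like from Python. The write_file/run_shell helpers, the CSV column layout, and the sort flags are illustrative assumptions, not the published API:

# Illustrative only: write_file()/run_shell() are assumed helper names,
# and the column layout (product in field 2, revenue in field 5) is made up.
sandbox.write_file("/workspace/sales.csv", open("sales.csv", "rb").read())

top3 = sandbox.run_shell(
    "sort -t, -k5 -rn /workspace/sales.csv | head -3 | cut -d, -f2,5"
)
print(top3)  # a few dozen bytes reach the model's context, not 50MB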

Try the sandbox now

Live in your browser

Live Shell Demo

Run shell commands and JavaScript, and explore the virtual filesystem—all in your browser.

~13MB download • Cached for future visits

Animation: WebAssembly memory during deterministic replay. External API calls are captured during live execution and injected during replay to produce identical results.

Debug any workflow, anytime

External calls are captured during live execution. Replay injects them to produce identical results.
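
A sketch of how capture and replay might be driven from Python; the record()/replay() entry points and the trace object are assumed names, not the published API:

# Illustrative only: record()/replay() are assumed entry points.
trace = sandbox.record("jq '.total' /workspace/report.json")  # live run: external inputs captured

# Later, while debugging offline:
replayed = sandbox.replay(trace)          # captured inputs are injected verbatim
assert replayed.output == trace.output    # bit-for-bit identical result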

The context bloat problem

Anthropic reduced internal tool definitions from 150K to 2K tokens (a 98.7% reduction) by using code execution.

Without sandbox
Agent: stripe.list(...)
→ 47KB JSON to context
Agent: stripe.get(...)
→ 3KB more to context

50KB+ per query

Context fills up fast. Costs skyrocket.

With sandbox
Agent: writes JS to sandbox
→ Results stored in /workspace/
→ jq '.txn_id' → "txn_456"

~100 bytes in context

Process data locally. Return only what matters.
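
Sketched in code, with the call()/write_file()/run_shell() helpers and the response layout as illustrative assumptions rather than the published API:

# Illustrative only: helper names and response layout are assumptions.
resp = sandbox.call("stripe/charges/list", limit=100)       # ~47KB JSON, never shown to the model
sandbox.write_file("/workspace/txns.json", resp)             # parked in the scratchpad

txn_id = sandbox.run_shell("jq -r '.txn_id' /workspace/txns.json")
# ~100 bytes ("txn_456") go back into the context instead of 50KB+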

Core features

  • File Scratchpad

    Store results, process locally, return only what matters.

  • Shell Applets

grep, cut, sort, jq. Pipes, redirects, and heredocs just work.

  • Capability Tokens

    Constrain parameters, limit calls, enforce patterns.

  • Deterministic Replay

    Coroutine protocol. Step, yield, fully reproducible.

Enterprise ready, zero config

No VMs. No containers. No cloud dependencies.

  • pip install

    One command. Works in CI, notebooks, and production. No infrastructure to provision.

  • $0 marginal cost

    WASM runs in your process. No cloud API calls. Scale to millions at CPU cost only.

  • Full audit trail

    Every tool call logged with timestamps. Deterministic replay for debugging.
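
One way the audit trail might be consumed; the audit_log() accessor and its record fields are illustrative assumptions:

# Illustrative only: audit_log() and its field names are assumptions.
for entry in sandbox.audit_log():
    print(entry.timestamp, entry.method, entry.params, entry.allowed)
# e.g. <timestamp>  stripe/charges/list  {'limit': 100}  True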

Capabilities as code

Define what agents can do. Every tool call is validated.

capabilities.py
from amla_sandbox import Sandbox, MethodCapability, Param  # import path assumed from the package name

sandbox = Sandbox(
    capabilities=[
        MethodCapability(
            method_pattern="stripe/charges/*",
            constraints=[
                Param("amount").lte(10000),  # 10000 cents = $100 max
                Param("currency").is_in(["USD", "EUR"]),
            ],
            max_calls=100,
        ),
    ],
)
  • Pattern matching: stripe/**, */create
  • Constraint DSL: comparisons, sets
  • Call budgets: prevent runaway loops
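
Building on the snippet above, a sketch of how an out-of-policy call might be rejected; the call() method and CapabilityError are assumed names:

# Illustrative only: call() and CapabilityError are assumed names.
sandbox.call("stripe/charges/create", amount=4200, currency="USD")   # within constraints: allowed

try:
    sandbox.call("stripe/charges/create", amount=50000, currency="USD")  # $500 > the $100 cap
except CapabilityError as err:
    print(err)  # amount violates lte(10000); the call never reaches Stripe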

How it compares

amla-sandbox
  • Setup: pip install
  • Isolation: WASM sandbox
  • Cold start: <10ms
  • Authorization: Capability tokens
  • Replay: Deterministic
  • Context: File scratchpad

eval()
  • No isolation, full code injection risk

Local Shell
  • No isolation, full host access

E2B
  • Remote API, 200–500ms cold start

Docker/VM
  • Heavy infra, 1–10s cold start, ops overhead

Other sandboxes focus on isolation. amla adds authorization, deterministic replay, and context budget control—managing what agents can do and see, not just where they run.

Frequently Asked Questions

For Engineers

How does the scratchpad work?
A POSIX-like filesystem in WASM memory. Files persist for the session. Store API responses, intermediate computations, or scratch data—then extract only what you need.
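For example (write_file()/run_shell() are assumed helper names, not the published API):

# Illustrative only: helper names are assumptions.
sandbox.write_file("/workspace/response.json", api_response)  # api_response: any large payload
count = sandbox.run_shell("jq '.items | length' /workspace/response.json")
# only the count returns to the model; the full payload stays in the scratchpad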
What shell commands are available?
grep, cut, sort, uniq, head, tail, wc, cat, jq, tr, find, and more. WebAssembly applets, not system calls. Pipes work.
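For example, a pipeline combining several of these applets (run_shell() is an assumed entry point):

# Illustrative only: run_shell() is an assumed entry point.
sandbox.run_shell(
    "grep -v '^#' /workspace/events.log | cut -d' ' -f3 | sort | uniq -c | sort -rn | head -5"
)
# the five most frequent event types, computed entirely inside the WASM applets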
Which LLM frameworks are supported?
LangGraph and CrewAI adapters ship out of the box. The core Sandbox class works with any framework.
Why is authorization built into the sandbox?
The sandbox already intercepts every tool call—they yield to the host, not execute directly. That's the natural chokepoint for authz. External authorization would add a hop, lose execution context, and break deterministic replay. Plus, capability tokens enable secure agent-to-agent delegation: when your agent spawns sub-agents, authority automatically attenuates.

For Platform Teams

Does data leave my infrastructure?
Never. amla-sandbox runs entirely in your process—no cloud calls, no data exfiltration. Your code, your data, your control.
What does deployment look like?
pip install amla-sandbox. That's it. No containers, no VMs, no cloud accounts. Empower developers to move fast without waiting on IT.
What's the security model?
WASM memory isolation + capability tokens. No direct network or syscalls from inside the sandbox—only host-mediated tool calls through a single chokepoint. Enterprise-friendly: zero attack surface expansion.
How do capability tokens work?
Unforgeable tokens specify which methods the sandbox can call and with what constraints. Every tool call is validated. Full audit trail included.

Capabilities scale naturally to multi-agent architectures—when Agent A delegates to Agent B, it can only grant a subset of its own authority. Attenuation is cryptographically enforced, not configured.

Technical Architecture

The Sandbox Binary

A 13MB statically-linked binary containing a WebAssembly runtime, virtual filesystem, and capability interpreter. Ships with no external dependencies.

Execution Model

Every external API call is intercepted and validated against the capability chain. Reads and writes go to a copy-on-write overlay. The agent never touches the real filesystem.
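
A conceptual sketch of that loop, using a Python generator to stand in for the guest; none of the names or structure below is the actual implementation:

# Conceptual sketch only, not the actual implementation.
def guest_program():
    """Stand-in for sandboxed agent code: every external effect is yielded to the host."""
    data = yield ("api", "stripe/charges/list", {"limit": 3})
    yield ("fs_write", "/workspace/charges.json", data)
    return "done"

def run(guest, check_capability, overlay_fs, call_api):
    gen = guest()
    result = None
    while True:
        try:
            request = gen.send(result)              # guest yields a tool-call request
        except StopIteration as finished:
            return finished.value
        check_capability(request)                   # validated against the capability chain
        if request[0] == "fs_write":
            overlay_fs[request[1]] = request[2]     # copy-on-write overlay, never the real filesystem
            result = None
        else:
            result = call_api(request)              # host-mediated external call, recorded for replay

# Usage with trivial stubs:
run(guest_program, check_capability=lambda r: None,
    overlay_fs={}, call_api=lambda r: '{"data": []}')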

Capture Format

The WASM runtime is constrained so all external effects flow through host-mediated calls under the host's full control. We record inputs (API responses, file reads, timestamps) in a compact binary format. Replay substitutes these values exactly, making execution deterministic.
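
A sketch of what one captured record could contain; the real format is a compact binary encoding, so these field names are assumptions used only to show what gets recorded:

# Illustrative only: field names are assumptions; the real format is compact binary.
from dataclasses import dataclass, field

@dataclass
class CapturedEffect:
    seq: int         # position in the execution, so replay substitutes values in order
    kind: str        # "api", "fs_read", "clock", ...
    request: bytes   # the outbound call as the guest issued it
    response: bytes  # the recorded input that replay injects verbatim

@dataclass
class Trace:
    effects: list[CapturedEffect] = field(default_factory=list)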

Capability Attenuation

When an agent delegates to a sub-agent, it can only grant a subset of its own capabilities. The sandbox enforces this at the API boundary—no configuration required.
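
A sketch of delegation with attenuation, reusing the Sandbox and MethodCapability types from the earlier snippet; the attenuate() helper is an assumed name:

# Illustrative only: attenuate() is an assumed name, not the published API.
parent = Sandbox(capabilities=[
    MethodCapability(method_pattern="stripe/**", max_calls=100),
])

# The sub-agent gets a strict subset: narrower pattern, smaller budget.
child = Sandbox(capabilities=parent.capabilities.attenuate(
    method_pattern="stripe/charges/list",
    max_calls=10,
))
# Trying to widen authority (e.g. back to "stripe/**") is rejected at the API boundary.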

Give your agents a scratchpad

13MB binary. Zero infrastructure. Runs anywhere Python runs.

pip install amla-sandbox (coming soon)