Network Policy Belongs in the VM Device Model
How Kalahari turns sandbox network options into VM device state through usernet, packet policy, packet leases, and scoped wake tokens.
Network access is one of the sharpest edges in an agent sandbox.
An agent often needs enough network to install packages, call APIs, fetch test fixtures, or run a preview. That does not mean it should inherit every network path available to the host. It also does not mean policy can be treated as a thin SDK flag over ordinary sockets.
For Kalahari, the network API is intentionally small:
```ts
await client.createSandbox({
  network: {
    mode: 'publicInternet',
    dnsMode: 'useEgressPolicy',
  },
});
```

The implementation behind that option cannot be small in the same way. A VM network stack has virtio queues, guest-visible gateway topology, NAT state, DNS forwarding, packet admission, connection tracking, and async wakeups. If those pieces can disagree, the product API is just a suggestion.
That is the design rule Kalahari follows:
Network policy is not metadata attached to a sandbox. It is part of the VM device state that moves packets.
Defaults Are Product Policy
The current Kalahari runtime defaults to unrestricted networking when network is omitted. In public API terms, omitted networking behaves like mode: 'unrestricted' and dnsMode: 'unrestricted'.
That default is a product choice. It is not a recovery path for bad input.
The public TypeScript shape exposes these modes:
```ts
await client.createSandbox({
  image: 'node:22-alpine',
  network: {
    mode: 'unrestricted', // or 'publicInternet' or 'denyAll'
    dnsMode: 'denyAll',
    allowList: ['198.51.100.20/32'],
  },
});
```

At the native boundary, unsupported network modes and unsupported DNS modes are errors. CIDR entries in allowList are parsed as IPv4 CIDR ranges, prefix lengths are bounded, hostnames are rejected, and the list is capped. Invalid input fails sandbox creation instead of silently widening access.
This distinction matters. A permissive default can be documented and changed by passing network: { mode: 'denyAll', dnsMode: 'denyAll' }. A malformed policy that quietly becomes broad network access is a security bug.
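The fail-closed parsing described above can be sketched as follows. This is an illustrative TypeScript sketch, not Kalahari's native implementation; the function names and the `MAX_ALLOW_LIST` cap are assumptions.

```ts
// Hypothetical sketch of strict allowList validation: IPv4 CIDRs only,
// bounded prefix lengths, a capped list. Invalid input throws, so sandbox
// creation fails instead of access silently widening.
const MAX_ALLOW_LIST = 64; // assumed cap, not a documented Kalahari limit

function parseIpv4Cidr(entry: string): { addr: number; prefix: number } {
  const m = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})\/(\d{1,2})$/.exec(entry);
  if (!m) throw new Error(`not an IPv4 CIDR: ${entry}`); // hostnames rejected here
  const octets = m.slice(1, 5).map(Number);
  if (octets.some((o) => o > 255)) throw new Error(`bad octet in ${entry}`);
  const prefix = Number(m[5]);
  if (prefix > 32) throw new Error(`prefix out of range in ${entry}`);
  const addr =
    ((octets[0] << 24) | (octets[1] << 16) | (octets[2] << 8) | octets[3]) >>> 0;
  return { addr, prefix };
}

function parseAllowList(entries: string[]) {
  if (entries.length > MAX_ALLOW_LIST) throw new Error('allowList too large');
  return entries.map(parseIpv4Cidr); // any single failure aborts creation
}
```

The key property is that every branch either produces a fully parsed entry or throws; there is no "unparseable, so treat as allow-everything" path.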
One API, Two Enforcement Layers
Kalahari turns the public network object into a user-mode TCP/IP stack, then wraps that stack in packet policy:
```
guest virtio-net
  -> packet policy
  -> user-mode TCP/IP stack
  -> host sockets
```

The user-mode stack owns the guest-facing network segment, NAT, DNS forwarding, and host socket egress checks. mode: 'publicInternet' becomes an egress policy that only opens host sockets to globally routable destinations. mode: 'denyAll' prevents outbound guest NAT from opening host sockets. dnsMode separately controls the DNS forwarder path.
The packet policy sits closer to the virtio-net device. It is raw packet admission plus connection tracking. When an allowList is provided, Kalahari builds a packet policy from those IPv4 CIDRs. Without an allowList, Kalahari builds a packet policy that admits ordinary IPv4 and IPv6 packet destinations, DHCP, and ICMP, leaving the egress and DNS policies to decide whether host sockets can actually be opened.
The composition is important: packet admission and host-socket egress both have to line up. A packet may be well-formed and match the packet policy, but still fail later because the user-mode stack refuses the host connection. Likewise, the user-mode stack may have broad egress configured, but the packet wrapper can reject a destination before it reaches that layer.
That layering keeps the public API stable while placing enforcement where the relevant facts are visible.
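The two-layer composition can be sketched in a few lines. This is a hedged TypeScript illustration, not the real enforcement code: the function names, the `PacketMeta` shape, and the private-range test standing in for "not globally routable" are all assumptions.

```ts
type Mode = 'unrestricted' | 'publicInternet' | 'denyAll';
type Decision = 'allow' | 'deny';

interface PacketMeta { dstIp: string; dstPort: number; proto: 'tcp' | 'udp' }

// Layer 1: raw packet admission near the virtio-net device (L3/L4 facts only).
function packetPolicy(meta: PacketMeta, admits: (ip: string) => boolean): Decision {
  return admits(meta.dstIp) ? 'allow' : 'deny';
}

// Rough stand-in for "not globally routable": RFC 1918, loopback, link-local.
function isPrivateIpv4(ip: string): boolean {
  return /^(10\.|127\.|169\.254\.|192\.168\.|172\.(1[6-9]|2[0-9]|3[01])\.)/.test(ip);
}

// Layer 2: host-socket egress check inside the user-mode stack.
function egressPolicy(mode: Mode, dstIp: string): Decision {
  if (mode === 'denyAll') return 'deny';
  if (mode === 'publicInternet' && isPrivateIpv4(dstIp)) return 'deny';
  return 'allow';
}

// Both layers must agree before a host socket is opened.
function mayOpenHostSocket(
  meta: PacketMeta, mode: Mode, admits: (ip: string) => boolean,
): boolean {
  return packetPolicy(meta, admits) === 'allow'
      && egressPolicy(mode, meta.dstIp) === 'allow';
}
```

Either layer can say no on its own: a packet that matches the admission policy can still be refused a host socket, and a broad egress mode never rescues a packet the wrapper already rejected.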
Packet Policy Is Deliberately L3/L4
The packet policy layer is intentionally narrower than a full application firewall.
The packet policy evaluates Ethernet/IP packets, TCP and UDP ports, ICMP settings, DHCP exceptions, malformed headers, fragments, checksums, and inbound connection tracking. It does not perform DNS, TLS, HTTP, or body inspection. Domain-aware and evidence-driven stream authorization belongs in the stream lifecycle, where the networking layer can see DNS answers, SNI, HTTP host evidence, and the moment before a host socket is opened.
That prevents a common layering mistake: projecting a domain rule into raw packet admission as if a packet carried the same evidence as a stream. Packets carry destination addresses and ports. Streams can accumulate evidence. The code keeps those responsibilities separate.
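The header-facts-only scope can be made concrete with a small admission check. This sketch is illustrative and deliberately incomplete: the `Ipv4Header` shape and thresholds are assumptions, and it omits the DHCP and ICMP exceptions, checksum validation, and connection tracking the real layer performs.

```ts
// Hypothetical sketch of L3/L4 packet admission: every decision uses only
// header facts. Nothing here can see DNS answers, SNI, or HTTP evidence.
interface Ipv4Header {
  version: number;
  ihl: number;            // header length in 32-bit words
  totalLen: number;       // total length claimed by the header
  fragOffset: number;
  moreFragments: boolean;
  proto: number;          // 1 = ICMP, 6 = TCP, 17 = UDP
}

function admitIpv4(h: Ipv4Header, rawLen: number): 'allow' | 'deny' {
  if (h.version !== 4 || h.ihl < 5) return 'deny';            // malformed header
  if (h.totalLen !== rawLen) return 'deny';                   // length mismatch
  if (h.fragOffset !== 0 || h.moreFragments) return 'deny';   // fragments denied
  if (h.proto !== 6 && h.proto !== 17 && h.proto !== 1) return 'deny'; // unknown protocol
  return 'allow';
}
```

Note what the signature makes impossible: there is no field a domain rule could attach to, which is exactly why domain-aware decisions live in the stream lifecycle instead.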
The Guest Sees a Real Topology
The guest does not see “network policy.” It sees a network device.
Kalahari’s guest setup brings up eth0, assigns the default guest address, installs a default route through the gateway, installs the gateway neighbor entry, and writes resolv.conf to point at the gateway DNS forwarder. The constants are shared: the default gateway is 10.0.2.2, the default guest address is 10.0.2.15, and the gateway also acts as the DNS server from the guest’s point of view.
That canonical topology is what lets the VM layer reason about policy consistently. DNS forwarding, NAT, DHCP replies, ICMP, and packet admission are all expressed relative to the same guest segment instead of being rebuilt ad hoc in each caller.
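Pinning that topology down as shared constants might look like the sketch below. The addresses come from the article; the object names, the /24 prefix, and the exact guest setup commands are assumptions for illustration.

```ts
// Hypothetical shared constants for the canonical guest segment.
const GUEST_TOPOLOGY = {
  gateway: '10.0.2.2',    // default route, DNS forwarder, and NAT hop
  guestAddr: '10.0.2.15', // default guest address on eth0
  prefixLen: 24,          // assumed segment size, not stated in the article
} as const;

// The guest-side setup, expressed as the commands it amounts to.
const guestSetup = [
  `ip addr add ${GUEST_TOPOLOGY.guestAddr}/${GUEST_TOPOLOGY.prefixLen} dev eth0`,
  'ip link set eth0 up',
  `ip route add default via ${GUEST_TOPOLOGY.gateway} dev eth0`,
  `echo "nameserver ${GUEST_TOPOLOGY.gateway}" > /etc/resolv.conf`,
];
```

Because every layer (NAT, DHCP replies, DNS forwarding, packet admission) derives from the same constants, there is no second copy of the topology to drift out of sync.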
Receive Packets Need Leases
Virtio-net receive looks simple until the timing gets interesting.
The backend can have a packet ready before the guest has posted a receive descriptor. The guest can post descriptors before the backend has any packet. A packet can be denied by policy after it has been observed. The VM can be parking, draining, or preparing for a zygote boundary while async network tasks are still capable of producing guest-bound data.
Kalahari handles that with packet leases. Receiving a packet returns a lease, not an owned packet that is immediately consumed. The lease exposes the packet bytes. Dropping the lease leaves the packet pending. Committing the lease consumes exactly the packet represented by that lease.
The virtio-net device relies on that contract. On RX, it leases a backend packet, validates the guest descriptor path, writes the virtio-net header and payload into guest memory, publishes the used-ring completion, and only then commits the lease. If there is no guest buffer, the lease is dropped and the packet remains available.
The user-mode network stack follows the same rule internally. Its leased receive packet is retained until commit, and if the lease is dropped before commit, the packet is requeued to the front of the guest-bound queue. The packet-policy layer wraps inner leases too: accepted inbound packets are audited when committed, while denied inbound packets are committed as discarded input so the stack can make progress without delivering them to the guest.
That lease protocol is what prevents “ghost receive” states where the backend believes a packet was taken but the guest never received it.
Wake Tokens Are Run-Scoped
Networking is full of asynchronous wakeups. TCP tasks wake when upstream bytes arrive. UDP and DNS tasks wake when datagrams are ready. The virtio-net device wakes when the backend may have guest-bound packets.
Those wakeups need more than a callback. They need scope.
A wake token is cancelable and scoped to the current network device run. The VMM creates the token and registers it with the backend. NAT, DNS, TCP, and UDP paths may clone that token into detached work, but the token has an active bit. When the backend replaces a token, the old token is canceled, and every clone observes the same canceled state.
That is a device-state invariant, not just a performance optimization. A stale task should not be able to wake a later run of the VM, and a zygote child should not inherit a parent-side wake authority that crosses the snapshot boundary.
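The shared-active-bit behavior can be sketched directly. This is an illustrative TypeScript model, not Kalahari's implementation; the class and method names are assumptions.

```ts
// Hypothetical sketch of a run-scoped wake token: every clone shares one
// active flag, so canceling the token revokes all clones at once.
class WakeToken {
  private constructor(
    private shared: { active: boolean },
    private notify: () => void,
  ) {}

  // Created by the VMM for the current network device run.
  static forRun(notify: () => void): WakeToken {
    return new WakeToken({ active: true }, notify);
  }

  // NAT/DNS/TCP/UDP paths clone the token into detached work.
  clone(): WakeToken {
    return new WakeToken(this.shared, this.notify);
  }

  // Called when the backend replaces the token for a new run.
  cancel() { this.shared.active = false; }

  // A stale token (or any of its clones) loses the authority to wake.
  wake(): boolean {
    if (!this.shared.active) return false;
    this.notify();
    return true;
  }
}
```

Because the flag lives in shared state rather than in each clone, replacing the run's token is a single write that every outstanding async task observes.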
Why The API Stays Small
The Kalahari API should not expose virtio descriptors, conntrack tables, packet leases, or wake-token lifetimes. Users should be able to say:
```ts
await client.createSandbox({
  image: 'python:3.12-alpine',
  network: { mode: 'denyAll', dnsMode: 'denyAll' },
});
```

But the VM layer still has to enforce that choice through the paths that actually move bytes.
Agents change the network threat model.
An autonomous agent may run with repository contents, API keys, user credentials, internal service tokens, or generated artifacts that were never meant to leave the sandbox. Network access becomes both an exfiltration path and a way to act as a confused deputy.
That is why the implementation treats network policy as part of VM correctness:
- unsupported public modes fail at construction
- invalid CIDR lists fail at construction
- malformed packets, fragments, and unknown protocols are denied by packet policy
- oversized virtio-net frames are rejected or dropped through defined device paths
- guest-bound packets are consumed only after guest delivery is durable
- stale wake tokens lose the authority to notify a later device run
The product lesson is simple, but it is easy to miss: in a VM sandbox, network policy is not a wrapper around sockets. It belongs in the same correctness model as queues, descriptors, leases, wake tokens, and snapshot quiescence.