Agents Need Kernels, Not Guardrails
Operating systems solved process isolation structurally, not behaviorally. Agents need the same treatment: a kernel that controls what they can see and do.
By the late 1960s, systems like Multics tied each process to its own virtual address space. Other processes’ memory wasn’t hidden—it was unaddressable.
Not because processes were told not to access it. Not because a filter checked their outputs. Not because someone reviewed their requests.
The hardware memory management unit made it impossible to even name another process’s address space. The kernel enforced isolation structurally. A process couldn’t construct a valid pointer to memory it wasn’t authorized to access—unless that memory was explicitly mapped into its space.
Note: processes are isolated per-process, not per-user. Even two processes with the same uid get separate address spaces. Each invocation is independent.
Fifty years later, we’re solving agent security with the equivalent of “please don’t read other users’ memory”—missing that the real boundary isn’t the user. It’s the transaction.
The Diagnosis
We’ve covered why agent memory is architecturally broken and how the confused deputy problem applies to agents. The short version:
- Memory leakage: Agents can retrieve any semantically-similar chunk from shared stores. Multi-tenant data leaks through the agent’s brain.
- Capability crossing: Agents accumulate credentials and can use one principal’s authority while serving another’s transaction.
- Guardrails fail: Prompt engineering, output filtering, and human-in-the-loop are all behavioral mitigations for an architectural problem.
This post is about the solution: agents don’t need better guardrails. They need a kernel.
Terms (So We Mean the Same Thing)
- Transaction: the unit of authorization and isolation. It starts when a user- or system-initiated action is admitted and ends when its side effects commit.
- Context: the scoped working set bound to a transaction (capabilities, memory partitions, and designated inputs).
- Session: the user interaction channel that may span multiple transactions.
- Workflow/Task: the logical business process that can include multiple transactions.
Agents Can “Name” Everything
The architectural flaw in current agent frameworks is that agents can name things they shouldn’t access.
In operating systems, a process can’t construct a pointer to another process’s memory because virtual addresses only resolve within the process’s own page tables. The data isn’t hidden—it’s unaddressable.
In agent frameworks, there’s no equivalent. The agent can search all memory. It can use any credential it holds. The runtime trusts the agent to choose correctly.
This is the wrong trust model. The agent shouldn’t choose.
What Would a Kernel for Agents Look Like?
An operating system kernel does three things:
- Isolates processes — each process has its own address space
- Mediates resources — all I/O goes through syscalls
- Enforces policy — the kernel decides what’s allowed, not the process
An agent kernel does the same:
```mermaid
flowchart TB
  subgraph kernel["Agent Runtime (Kernel)"]
    direction TB
    subgraph ctx["Context Manager"]
      ctx1["Creates isolated execution contexts per transaction"]
      ctx2["Binds capabilities to contexts"]
      ctx3["Partitions memory by context"]
    end
    subgraph syscall["Syscall Interface"]
      s1["tool_call(handle, params) → result"]
      s2["memory_read(key) → value"]
      s3["memory_write(key, value) → ok"]
      s4["spawn_child(delegation) → handle"]
      s5["log(event) → ok"]
    end
    subgraph sandbox["Sandbox Enforcer"]
      sb1["Network isolation"]
      sb2["Filesystem restrictions"]
      sb3["All I/O through syscall interface"]
    end
  end
  subgraph guest["Userspace"]
    g1["Agent framework (LangGraph, CrewAI, etc.)"]
    g2["LLM (untrusted plugin)"]
    g3["Thinks it's running normally"]
    g4["Has no idea it's sandboxed"]
  end
  kernel --> guest
```

The LLM becomes untrusted userspace. It receives scoped context. It returns tool calls. The kernel verifies those calls against the capabilities bound to this context.
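A minimal sketch of that verification step, assuming a hypothetical `Context` object that carries the handles the kernel bound before the agent ran (the names and signatures are illustrative, not any real framework’s API):

```python
# Illustrative sketch, not a real framework API: the kernel-side check
# that every proposed tool call names a handle bound to this context.
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Context:
    context_id: str
    # Opaque tool handles the kernel bound to this context.
    capabilities: frozenset[str] = field(default_factory=frozenset)


class CapabilityError(Exception):
    """Raised when a call names a handle this context does not hold."""


def dispatch(handle: str, params: dict) -> dict:
    # Stand-in for the syscall broker's routing table.
    return {"handle": handle, "ok": True}


def tool_call(ctx: Context, handle: str, params: dict) -> dict:
    # Fail closed: a handle not bound to this context simply doesn't resolve.
    if handle not in ctx.capabilities:
        raise CapabilityError(f"{handle!r} is not bound to {ctx.context_id}")
    return dispatch(handle, params)


ctx = Context("ctx_txn_001", frozenset({"crm.lookup", "llm.complete"}))
tool_call(ctx, "crm.lookup", {"account": "acme"})     # allowed
# tool_call(ctx, "payments.refund", {"amount": 500})  # CapabilityError
```

The denial is not a policy judgment; the handle just never resolves outside its context.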
Five Syscalls
For complete mediation, you need to mediate four things: effects (anything that changes the world), information (anything that enters context), delegation (who can act next), and audit (what happened). Five syscalls cover them:
```
tool_call(handle, params)    → result   # Effects (and external reads)
memory_read(key)             → value    # Information in
memory_write(key, value)     → ok       # Information out
spawn_child(delegation)      → handle   # Delegation with attenuation
log(event)                   → ok       # Immutable audit trail
```

Everything else, including LLM inference and retrieval, is a tool behind tool_call. The only way to reach the model is tool_call("llm.complete", ...). The only way to search memory is tool_call("retriever.search", ...), which returns (chunk, proof_of_scope), where proof_of_scope is a signed statement binding context_id, namespace, and chunk_hash to the retrieval policy in force.
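One way proof_of_scope could be realized, sketched here with an HMAC for brevity where a production system would more likely use asymmetric signatures; the field names follow the prose above and the key handling is deliberately simplified:

```python
# Sketch only: an HMAC-signed scope statement binding context_id,
# namespace, and chunk_hash to the retrieval policy in force.
import hashlib
import hmac

BROKER_KEY = b"demo-key"  # held by the verifier; never visible to userspace


def sign_scope(context_id: str, namespace: str, chunk: bytes, policy_id: str) -> bytes:
    chunk_hash = hashlib.sha256(chunk).hexdigest()
    statement = f"{context_id}|{namespace}|{chunk_hash}|{policy_id}".encode()
    return hmac.new(BROKER_KEY, statement, hashlib.sha256).digest()


def verify_scope(context_id: str, namespace: str, chunk: bytes,
                 policy_id: str, proof: bytes) -> bool:
    expected = sign_scope(context_id, namespace, chunk, policy_id)
    return hmac.compare_digest(expected, proof)


chunk = b"Q3 revenue summary ..."
proof = sign_scope("ctx_txn_001", "tenant_acme/user_alice", chunk, "policy_v7")
assert verify_scope("ctx_txn_001", "tenant_acme/user_alice", chunk, "policy_v7", proof)
# The same proof replayed under a different context fails closed:
assert not verify_scope("ctx_txn_002", "tenant_acme/user_alice", chunk, "policy_v7", proof)
```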
No other I/O is allowed. Network is blocked. Filesystem is restricted to the sandbox. Direct HTTP clients, SDK caches, temp files—all either mediated or unavailable. If there’s an unmediated path, an attacker will find it and exfiltrate through it.
In this model, the kernel runtime is the Trust Plane: it holds credentials, verifies capability chains, and enforces constraints before any effect. The kernel is the enforcement point; Proof of Continuity is the cryptographic mechanism that binds authority to transactions.
Context lifecycle is implicit in the hosting model: each transaction runs inside a kernel-enforced isolation boundary (process, microVM, container, or hardware sandbox with strict I/O gating). Context begins at boundary start, ends at boundary exit. The guest never sees these boundaries—it just runs in an environment where the only way out is through the syscalls.
Context Isolation
The primary invariant: each transaction runs in isolation, with memory scoped to that invocation’s designated context.
This is stricter than per-user or per-tenant isolation. When a transaction arrives, the kernel creates an isolated execution context—a runtime environment with bound capabilities and partitioned memory (distinct from the LLM’s context window, which we address next):
```yaml
Context_A (Alice's Transaction 1):
  context_id: ctx_txn_001
  capabilities: [Cap_A bound to opaque tool handles]
  memory_partition: tenant_acme/user_alice/transaction_001/*
  inherited_context: [explicitly mapped chunks from prior transactions]

Context_B (Alice's Transaction 2):
  context_id: ctx_txn_002
  capabilities: [Cap_B bound to opaque tool handles]
  memory_partition: tenant_acme/user_alice/transaction_002/*
  inherited_context: []  # Fresh start unless explicitly mapped
```

Transaction 2 cannot access Transaction 1’s memory partition, even though it’s the same user. Prior context is only available if explicitly mapped into the new transaction’s scope. This prevents:
- Cross-session leakage: Sensitive data from one conversation can’t semantically bleed into another
- Accumulation attacks: Poisoned content in Transaction 1 can’t corrupt Transaction 2
- Persistent manipulation: SpAIware-style attacks that rely on long-lived memory writes
The agent processing Alice’s first transaction literally cannot name her second transaction’s partition—or even her first transaction’s partition when processing the second. It’s not filtered—it’s unaddressable.
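A sketch of what “unaddressable” means here, assuming a simple key-value store (a filesystem-backed store would additionally need path canonicalization before the prefix is applied):

```python
# Sketch: keys resolve only inside the partition the kernel bound to the
# context. Another transaction's partition isn't filtered out; its keys
# simply cannot be spelled from here.
class PartitionedMemory:
    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    def read(self, partition: str, key: str) -> str:
        # The partition prefix comes from the kernel-held context, never
        # from the agent, so every key the agent names lands inside it.
        return self._store[f"{partition}/{key}"]

    def write(self, partition: str, key: str, value: str) -> None:
        self._store[f"{partition}/{key}"] = value


mem = PartitionedMemory()
p1 = "tenant_acme/user_alice/transaction_001"
p2 = "tenant_acme/user_alice/transaction_002"
mem.write(p1, "notes", "from Alice's first transaction")
mem.read(p1, "notes")    # ok: same partition
# mem.read(p2, "notes")  # KeyError: nothing to find, nothing to leak
```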
Handles are unforgeable and context-bound—either OS-level object capabilities (like file descriptors, which can’t be reconstructed from bytes) or cryptographic handles bound to context_id with replay protection. They fail closed outside their originating context. The agent doesn’t choose which credentials to use—the kernel binds them before the agent runs.
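A sketch of the cryptographic variant, assuming an HMAC keyed by the kernel; expiry and replay counters are omitted for brevity:

```python
# Sketch: a tool handle MAC-bound to its context_id. Presented under any
# other context, the MAC no longer verifies, so the handle fails closed.
import hashlib
import hmac

KERNEL_KEY = b"kernel-secret"  # held by the kernel; never exposed to userspace


def mint_handle(context_id: str, tool: str) -> str:
    tag = hmac.new(KERNEL_KEY, f"{context_id}|{tool}".encode(), hashlib.sha256)
    return f"{tool}:{tag.hexdigest()}"


def resolve_handle(context_id: str, handle: str) -> str:
    tool, _, _ = handle.partition(":")
    # Recompute the MAC under the presenting context; a mismatch means the
    # handle was minted for some other context.
    if not hmac.compare_digest(mint_handle(context_id, tool), handle):
        raise PermissionError("handle is not valid in this context")
    return tool


h = mint_handle("ctx_txn_001", "crm.lookup")
resolve_handle("ctx_txn_001", h)    # ok
# resolve_handle("ctx_txn_002", h)  # PermissionError: wrong context
```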
This is the same guarantee processes get: fresh address space per invocation, even for the same user.
Ephemeral Reasoning, Externalized Memory
There’s a subtler isolation problem: conversation history.
In multi-turn agents, the conversation accumulates in the LLM’s context. If Transaction A and Transaction B share an agent runtime, Transaction A’s conversation might still be in context when Transaction B begins—even if they’re the same user’s transactions. The kernel controls tool access, but the LLM’s “working memory” leaks.
The solution: separate reasoning from memory, with transaction-bounded persistence.
| Concern | Current model | Kernel model |
|---|---|---|
| Reasoning (LLM thinking) | Persistent across transactions | Ephemeral per-transaction |
| Memory (agent knowledge) | Implicit in context window | Explicit, kernel-mediated |
| Writes | Immediate, unverified | Transaction-end, verified |
The kernel creates a fresh conversation buffer for each transaction. When the transaction ends, the buffer is destroyed. Proposed memory writes are verified against scope constraints before persisting.
Transaction lifecycle:

```
1. Context handle created (ctx_txn_123)
2. Designated memory loaded into working context
3. Agent reasons, proposes tool calls
4. Tool calls verified against capabilities
5. Agent proposes memory writes
6. Writes verified against scope, persisted if valid
7. Conversation buffer destroyed
```

This prevents accumulation attacks: even if a poisoned prompt triggers a malicious memory write in step 5, the kernel can reject it in step 6. And if one slips through, it’s scoped to that transaction’s partition, not the user’s entire memory.
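A sketch of steps 6 and 7, with illustrative names throughout; the scope check is the string-prefix version of “every write must land in this transaction’s partition” (a real store would canonicalize keys first):

```python
def commit_transaction(partition: str, proposed_writes: dict[str, str],
                       buffer: list[str], store: dict[str, str]) -> list[str]:
    """Persist in-scope writes, reject the rest, destroy the buffer (steps 6-7)."""
    rejected = []
    for key, value in proposed_writes.items():
        # Scope check: every proposed key must stay inside this
        # transaction's partition.
        if not key.startswith(partition + "/") or ".." in key:
            rejected.append(key)
            continue
        store[key] = value
    buffer.clear()  # the conversation buffer never outlives the transaction
    return rejected


store: dict[str, str] = {}
buffer = ["user: refund order 42", "agent: checking policy ..."]
rejected = commit_transaction(
    "tenant_acme/user_alice/transaction_001",
    {
        "tenant_acme/user_alice/transaction_001/learned": "prefers email",
        "tenant_acme/user_alice/profile/always_trust": "true",  # poisoned write
    },
    buffer,
    store,
)
# rejected == ["tenant_acme/user_alice/profile/always_trust"]; buffer is empty.
```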
Multi-turn conversation within a transaction is fine (all turns share the same context, same capabilities). But when the transaction ends, the working context is destroyed. The next transaction starts with a clean slate unless explicitly granted access to prior context.
What about persistent memory? Agents need to remember things across transactions. But memory becomes explicit:
- Memory injection at transaction start: Load authorized data into working context
- Memory access via tool calls: Dynamic lookups during execution, capability-checked
- Memory writes at transaction end: Persist learnings, scoped to the transaction’s capability, verified before commit
The kernel doesn’t prevent an agent from writing garbage to its own memory partition—that’s a userspace problem. But it guarantees that garbage stays in the designated partition. Other transactions—even from the same user—never see it unless explicitly granted access.
This is the segfault analogy: the kernel doesn’t prevent a process from corrupting its own memory. It prevents that corruption from affecting other processes. Same principle, applied to agent context.
Per-Tenant vs Per-Transaction: The Architectural Difference
Most security guidance stops at tenant isolation:
| Approach | What It Solves | What It Misses |
|---|---|---|
| Separate indexes per tenant | Customer A can’t see Customer B’s data | User’s sensitive Session 1 bleeds into their Session 2 |
| Metadata filters on retrieval | Queries scoped to current tenant | Semantic similarity ignores session boundaries |
| Row-level security | DB queries respect tenant boundaries | RAG retrieval isn’t a DB query |
Per-transaction scoping goes further:
| Invariant | Effect |
|---|---|
| Fresh context handle per transaction | Each invocation is independent |
| Memory partition per transaction | Cross-session leakage is unaddressable |
| Writes verified at transaction end | Poisoning can’t accumulate |
| Capabilities bound to context_id | Authority travels with the transaction |
The OS analogy: per-tenant is like per-user isolation (UIDs). Per-transaction is like per-process isolation (separate address spaces). Both matter. But per-process is the one that makes “bad” structurally inexpressible.
When you ask “can Agent B access this memory?”—the answer shouldn’t be “Agent B belongs to the same tenant.” It should be “Agent B has a capability that designates this memory for context ctx_req_456.”
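The difference between those two answers, reduced to a sketch (all names are illustrative):

```python
# Per-tenant: membership is the whole test, so any same-tenant chunk
# passes, across every session and transaction.
def tenant_check(agent_tenant: str, chunk_tenant: str) -> bool:
    return agent_tenant == chunk_tenant


# Per-transaction: the capability must name this context AND designate
# this memory; being in the same tenant is nowhere near sufficient.
def designation_check(cap_context_id: str, cap_prefix: str,
                      context_id: str, chunk_key: str) -> bool:
    return cap_context_id == context_id and chunk_key.startswith(cap_prefix)


assert designation_check(
    "ctx_req_456", "tenant_acme/user_bob/transaction_456/",
    "ctx_req_456", "tenant_acme/user_bob/transaction_456/notes")
```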
The Market Is Saying the Same Thing
At KubeCon NA 2025, Google announced Agent Sandbox—a Kubernetes primitive built specifically for agent isolation:
> “Providing kernel-level isolation for agents that execute code and commands is non-negotiable.”
The key design choice: isolation per transaction, not per user or per session:
> “Agentic code execution and computer use require an isolated sandbox to be provisioned for each task.”
Google’s “task” maps to our “transaction”—both represent the per-invocation isolation boundary. Agent Sandbox provides the infrastructure layer (process isolation, network restrictions). The authorization layer that binds capabilities and memory to that boundary is what we’re describing.
Built on gVisor with Kata Containers support, Agent Sandbox provisions a fresh execution environment for every invocation. This is the OS model: per-process isolation, applied to agents.
Gartner predicts that over 50% of enterprises will use AI security platforms by 2028—platforms that “protect against AI-specific risks such as prompt injection, data leakage, and rogue agent actions.”
The demand is real:
- 24.9% of large enterprises cite security as their second-largest blocker to agent deployment (LangChain, Dec 2025)
- 80% have AI agents that have taken unintended actions (SailPoint, May 2025)
- 56% of enterprise IT leaders cite security as their top agentic AI concern (UiPath, 2025)
The question isn’t whether agents need kernels. It’s who builds the authorization layer that binds capabilities to transactions—not just tenants.
Defense in Depth
Here’s where the architecture diverges from “just another sandbox.”
The key is separating trust boundaries. “Kernel” is doing a lot of work in this post—it actually refers to multiple layers with different trust assumptions:
| Layer | What It Does | Trust Assumption |
|---|---|---|
| Host boundary | OS-enforced isolation per transaction (process/microVM/container) | Assumed secure (OS/container) |
| Syscall broker | Mediates all I/O, enforces sandbox (the “kernel runtime” from earlier) | May have bugs |
| Crypto core | Signatures, attenuation, revocation, counters | Formally verifiable target |
| Userspace | Framework + LLM client | Untrusted |
The syscall broker can be either (a) instantiated per-transaction (simplest) or (b) a shared host service. If shared, all broker replies must be context-bound and carry verifiable scope proofs—so a routing bug can’t silently cross contexts.
Structural isolation (“Tenant B’s data doesn’t exist in Tenant A’s addressable memory”) is enforced by the host boundary—each transaction runs inside its own OS-enforced isolation boundary with its own memory space. The syscall broker can have bugs, but it can’t violate isolation because that’s the OS’s job.
Apple’s security architecture illustrates why even more separation helps. Apple’s Platform Security Guide describes SPTM (Secure Page Table Monitor), which “provides a smaller attack surface that doesn’t rely on trust of the kernel” and “protects page table integrity even when attackers have kernel write capabilities.”
Agent runtimes need the same pattern. The crypto core lives in a separate, minimal component—small enough to target formal verification (a few thousand lines). It validates capability signatures, enforces delegation chains, tracks use counts, and checks revocation.
What the crypto core must guarantee (even if the syscall broker is compromised):
- Capability validity (signature verification)
- Delegation chain integrity (hash chains + signatures)
- Monotonic attenuation (each delegation can only restrict, never expand)
- Use count limits (authoritative state)
- Revocation enforcement
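A sketch of the monotonic-attenuation check, with signature verification elided; in a real core each link would also be signature-verified against its parent before the scope comparison runs:

```python
# Sketch: walking a delegation chain, each link may only shrink the
# parent's scope and use count, never expand them.
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    scopes: frozenset[str]  # tool handles this link grants
    max_uses: int


def verify_chain(chain: list[Link]) -> Link:
    """Return the effective capability; raise if any link widens authority."""
    current = chain[0]  # root capability minted by the kernel
    for link in chain[1:]:
        if not link.scopes <= current.scopes:
            raise PermissionError("delegation tried to widen scope")
        if link.max_uses > current.max_uses:
            raise PermissionError("delegation tried to raise the use count")
        current = link
    return current


root = Link(frozenset({"crm.lookup", "email.send"}), max_uses=10)
child = Link(frozenset({"crm.lookup"}), max_uses=3)  # attenuated: ok
verify_chain([root, child])
# verify_chain([root, Link(frozenset({"payments.refund"}), 1)])  # raises
```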
What a compromised syscall broker can still do:
- Misroute data it legitimately received (use a valid capability for the wrong transaction)
- Fail to enforce sandbox rules (allow unmediated I/O)
What a compromised syscall broker CANNOT do:
- Mint new capabilities
- Exceed use counts
- Forge delegation chains
- Use revoked capabilities
- Violate attenuation constraints
- Read another process’s memory (host boundary)
What lives in the syscall broker (non-core): routing, transport, I/O multiplexing, sandbox policy enforcement, and tool invocation wiring.
You audit a few thousand lines of crypto core. You don’t have to fully trust 50,000 lines of syscall broker. Even if the broker has bugs, the cryptographic invariants hold—and OS-enforced isolation holds regardless.
Structural Isolation vs. Behavioral Policy
The difference matters:
Behavioral policy: “The agent shouldn’t access Tenant B’s data when processing Tenant A’s transaction.”
Structural isolation: “When processing Tenant A’s transaction, Tenant B’s data doesn’t exist in the agent’s addressable memory.”
Behavioral policy can fail—through bugs, prompt injection, or confused deputy errors. Structural isolation can’t be bypassed because there’s nothing to bypass. The data isn’t there to leak.
This is why operating systems don’t rely on processes being well-behaved. The MMU doesn’t evaluate whether a memory access is “appropriate.” It checks whether the page is mapped. If it’s not mapped, the access faults. No policy evaluation. No behavioral hope.
The Kernel Advantage
A kernel-based architecture inverts the trust model:
| Current model | Kernel model |
|---|---|
| Agent has all credentials | Agent has opaque tool handles |
| Agent searches all memory | Agent queries its partition only |
| Runtime trusts agent’s choices | Runtime binds choices to context |
| Guardrails hope for compliance | Kernel enforces by construction |
| Bug in runtime = full compromise | Bug in runtime ≠ capability forgery |
The kernel doesn’t ask the agent to be good. It makes “bad” impossible to express.
What This Enables
The kernel now controls three things:
- What the agent can see — context assembly from capability-scoped memory
- What the agent can do — tool invocation through mediated syscalls
- What the agent can remember — writes to partitioned, capability-gated storage
When isolation is structural and cryptographic invariants are separated:
- Multi-tenant safety. Tenant A’s data can’t leak to Tenant B because it’s not in Tenant B’s addressable memory or context.
- Cross-transaction isolation. Alice’s conversation is destroyed before Bob’s transaction begins. If a transaction gets manipulated, the damage is scoped to that transaction’s capability. The kernel doesn’t prevent userspace problems—it prevents them from spreading.
- Autonomous deployment. You’re not trusting the LLM’s judgment—you’re constraining its authority. Prompt injection controls intent, not capability. An attacker can still cause in-scope damage (burn use counts, write garbage to the tenant’s own partition, trigger harmful-but-authorized actions)—but the kernel limits blast radius, even if it can’t guarantee correct behavior.
- In-scope harm is still possible. The kernel prevents cross-scope exfiltration and cross-context confused deputy; it does not prevent an agent from doing bad things that are genuinely authorized. That’s a positioning feature, not a bug.
- Small TCB. The cryptographic core is a few thousand lines. You verify that, not the entire stack.
- Provable authority chains. Every capability is cryptographically signed. Audit isn’t “what did the agent do?”—it’s “by what authority, delegated by whom, with what constraints?”
This is the difference between “the agent might leak data” and “the agent can’t name the data to leak it.”
The Path Forward
Operating systems didn’t get secure by writing better processes. They got secure by building kernels that don’t require processes to be well-behaved.
Agent security will follow the same path. The question isn’t how to make agents behave better. It’s how to build runtimes that enforce isolation structurally, with cryptographic guarantees that hold even when the runtime doesn’t.
seL4 proved you could formally verify an OS kernel. The same approach applies to agent runtimes: a small, verifiable core that enforces the properties that matter, with defense in depth so that bugs in the larger system don’t compromise cryptographic invariants.
Agents don’t need guardrails. They need kernels.
This is part of a series on agent architecture. See also: Your Agent’s Memory Is a Security Hole · The Confused Deputy Problem, Explained · Proof of Continuity
References
- Daley & Dennis (1968): Virtual Memory, Processes, and Sharing in Multics — processes tied to virtual address spaces
- Saltzer & Schroeder (1975): The Protection of Information in Computer Systems — complete mediation principle
- Apple (2024): Apple Platform Security Guide — SPTM and defense in depth
- Klein et al. (SOSP 2009): seL4: Formal Verification of an OS Kernel — verified kernel (~9.5k LoC C)