
Agents Need Kernels, Not Guardrails

OS history solved process isolation structurally, not behaviorally. Agents need the same treatment: a kernel that controls what they can see and do.

Tags: architecture · security · kernel · isolation

By the late 1960s, systems like Multics tied each process to its own virtual address space. Other processes’ memory wasn’t hidden—it was unaddressable.

Not because processes were told not to access it. Not because a filter checked their outputs. Not because someone reviewed their requests.

The hardware memory management unit made it impossible to even name another process’s address space. The kernel enforced isolation structurally. A process couldn’t construct a valid pointer to memory it wasn’t authorized to access—unless that memory was explicitly mapped into its space.

Note: this isolation is per-process, not per-user. Even two processes running as the same uid get separate address spaces. Each invocation is independent.

Fifty years later, we’re solving agent security with the equivalent of “please don’t read other users’ memory”—missing that the real boundary isn’t the user. It’s the transaction.

The Diagnosis

We’ve covered why agent memory is architecturally broken and how the confused deputy problem applies to agents. The short version:

  • Memory leakage: Agents can retrieve any semantically-similar chunk from shared stores. Multi-tenant data leaks through the agent’s brain.
  • Capability crossing: Agents accumulate credentials and can use one principal’s authority while serving another’s transaction.
  • Guardrails fail: Prompt engineering, output filtering, and human-in-the-loop are all behavioral mitigations for an architectural problem.

This post is about the solution: agents don’t need better guardrails. They need a kernel.

Terms (So We Mean the Same Thing)

  • Transaction: the unit of authorization and isolation. It starts when a user- or system-initiated action is admitted and ends when its side effects commit.
  • Context: the scoped working set bound to a transaction (capabilities, memory partitions, and designated inputs).
  • Session: the user interaction channel that may span multiple transactions.
  • Workflow/Task: the logical business process that can include multiple transactions.
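
To make the bindings concrete, here's a minimal sketch of how a runtime might represent them. Everything here (`Capability`, `TransactionContext`, the field names) is illustrative, not an API from any existing framework:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Capability:
    """An unforgeable grant: an opaque tool handle plus its constraints."""
    tool_handle: str   # opaque handle, never a raw credential
    constraints: dict  # e.g. {"actions": ["read"], "max_uses": 3}

@dataclass(frozen=True)
class TransactionContext:
    """The unit of authorization and isolation: one per admitted action."""
    context_id: str                       # e.g. "ctx_txn_001"
    capabilities: tuple[Capability, ...]  # bound at admission, immutable
    memory_partition: str                 # e.g. "tenant_acme/user_alice/transaction_001/*"
    session_id: str                       # interaction channel; may span transactions
    workflow_id: str                      # business process; may span transactions
```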

Agents Can “Name” Everything

The architectural flaw in current agent frameworks is that agents can name things they shouldn’t access.

In operating systems, a process can’t construct a pointer to another process’s memory because virtual addresses only resolve within the process’s own page tables. The data isn’t hidden—it’s unaddressable.

In agent frameworks, there’s no equivalent. The agent can search all memory. The agent can invoke any credential it holds. The runtime trusts the agent to choose correctly.

This is the wrong trust model. The agent shouldn’t choose.

What Would a Kernel for Agents Look Like?

An operating system kernel does three things:

  1. Isolates processes — each process has its own address space
  2. Mediates resources — all I/O goes through syscalls
  3. Enforces policy — the kernel decides what’s allowed, not the process

An agent kernel does the same:

```mermaid
flowchart TB
    subgraph kernel["Agent Runtime (Kernel)"]
        direction TB
        subgraph ctx["Context Manager"]
            ctx1["Creates isolated execution contexts per transaction"]
            ctx2["Binds capabilities to contexts"]
            ctx3["Partitions memory by context"]
        end
        subgraph syscall["Syscall Interface"]
            s1["tool_call(handle, params) → result"]
            s2["memory_read(key) → value"]
            s3["memory_write(key, value) → ok"]
            s4["spawn_child(delegation) → handle"]
            s5["log(event) → ok"]
        end
        subgraph sandbox["Sandbox Enforcer"]
            sb1["Network isolation"]
            sb2["Filesystem restrictions"]
            sb3["All I/O through syscall interface"]
        end
    end
    subgraph guest["Userspace"]
        g1["Agent framework (LangGraph, CrewAI, etc.)"]
        g2["LLM (untrusted plugin)"]
        g3["Thinks it's running normally"]
        g4["Has no idea it's sandboxed"]
    end
    kernel --> guest
```

The LLM becomes untrusted userspace. It receives scoped context. It returns tool calls. The kernel verifies those calls against the capabilities bound to this context.

Five Syscalls

For complete mediation, you need to mediate four things: effects (anything that changes the world), information (anything that enters context), delegation (who can act next), and audit (what happened). Five syscalls cover them:

```
tool_call(handle, params) → result    # Effects (and external reads)
memory_read(key) → value              # Information in
memory_write(key, value) → ok         # Information out
spawn_child(delegation) → handle      # Delegation with attenuation
log(event) → ok                       # Immutable audit trail
```

Everything else—including LLM inference and retrieval—is a tool behind tool_call. The only way to reach the model is tool_call("llm.complete", ...). The only way to search memory is tool_call("retriever.search", ...), which returns (chunk, proof_of_scope) where proof_of_scope is a signed statement binding context_id, namespace, and chunk_hash to the retrieval policy in force.
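
A sketch of that mediation, reusing the illustrative `TransactionContext` from the terms section. In a real system the authoritative use counters live in the crypto core (more on that below); they're inlined here for brevity:

```python
class SyscallBroker:
    """Mediates tool_call: the only path from the agent to any effect."""

    def __init__(self, ctx: TransactionContext, tools: dict):
        self._ctx = ctx
        self._tools = tools  # handle -> callable, wired by the kernel
        self._uses: dict[str, int] = {}

    def tool_call(self, handle: str, params: dict):
        # 1. The handle must be bound to THIS context. Anything else is
        #    unaddressable, not merely forbidden.
        cap = next((c for c in self._ctx.capabilities
                    if c.tool_handle == handle), None)
        if cap is None:
            raise PermissionError(f"{handle} does not exist in {self._ctx.context_id}")
        # 2. Constraints are enforced before the effect, never after.
        used = self._uses.get(handle, 0)
        if used >= cap.constraints.get("max_uses", 1):
            raise PermissionError(f"use count exhausted for {handle}")
        self._uses[handle] = used + 1
        # 3. Only now does the effect happen.
        return self._tools[handle](params)
```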

No other I/O is allowed. Network is blocked. Filesystem is restricted to the sandbox. Direct HTTP clients, SDK caches, temp files—all either mediated or unavailable. If there’s an unmediated path, an attacker will find it and exfiltrate through it.

In this model, the kernel runtime is the Trust Plane: it holds credentials, verifies capability chains, and enforces constraints before any effect. The kernel is the enforcement point; Proof of Continuity is the cryptographic mechanism that binds authority to transactions.

Context lifecycle is implicit in the hosting model: each transaction runs inside a kernel-enforced isolation boundary (process, microVM, container, or hardware sandbox with strict I/O gating). Context begins at boundary start, ends at boundary exit. The guest never sees these boundaries—it just runs in an environment where the only way out is through the syscalls.

Context Isolation

The primary invariant: each transaction runs in isolation, with memory scoped to that invocation’s designated context.

This is stricter than per-user or per-tenant isolation. When a transaction arrives, the kernel creates an isolated execution context—a runtime environment with bound capabilities and partitioned memory (distinct from the LLM’s context window, which we address next):

```
Context_A (Alice's Transaction 1):
  context_id: ctx_txn_001
  capabilities: [Cap_A bound to opaque tool handles]
  memory_partition: tenant_acme/user_alice/transaction_001/*
  inherited_context: [explicitly mapped chunks from prior transactions]

Context_B (Alice's Transaction 2):
  context_id: ctx_txn_002
  capabilities: [Cap_B bound to opaque tool handles]
  memory_partition: tenant_acme/user_alice/transaction_002/*
  inherited_context: []  # Fresh start unless explicitly mapped
```

Transaction 2 cannot access Transaction 1’s memory partition—even though it’s the same user. Prior context is only available if explicitly mapped into the new transaction’s scope. This prevents:

  • Cross-session leakage: Sensitive data from one conversation can’t semantically bleed into another
  • Accumulation attacks: Poisoned content in Transaction 1 can’t corrupt Transaction 2
  • Persistent manipulation: SpAIware-style attacks that rely on long-lived memory writes

The agent processing Alice’s first transaction literally cannot name her second transaction’s partition—or even her first transaction’s partition when processing the second. It’s not filtered—it’s unaddressable.

Handles are unforgeable and context-bound—either OS-level object capabilities (like file descriptors, which can’t be reconstructed from bytes) or cryptographic handles bound to context_id with replay protection. They fail closed outside their originating context. The agent doesn’t choose which credentials to use—the kernel binds them before the agent runs.
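
One way to build the cryptographic variant, using Python's standard `hmac`: the handle's tag is derived from a kernel-held key and the `context_id`, so a handle copied into another context fails verification. This is a sketch — replay protection (nonces or counters) is deliberately omitted:

```python
import hashlib
import hmac
import os

KERNEL_KEY = os.urandom(32)  # held by the kernel; userspace never sees it

def mint_handle(context_id: str, tool_name: str) -> str:
    """Bind a tool handle to exactly one context."""
    tag = hmac.new(KERNEL_KEY, f"{context_id}|{tool_name}".encode(),
                   hashlib.sha256).hexdigest()[:16]
    return f"{tool_name}.{tag}"

def verify_handle(context_id: str, handle: str) -> bool:
    """Fails closed: a handle minted for ctx_txn_001 is garbage in ctx_txn_002."""
    tool_name, _, _tag = handle.rpartition(".")
    return hmac.compare_digest(handle, mint_handle(context_id, tool_name))
```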

This is the same guarantee processes get: fresh address space per invocation, even for the same user.

Ephemeral Reasoning, Externalized Memory

There’s a subtler isolation problem: conversation history.

In multi-turn agents, the conversation accumulates in the LLM’s context. If Transaction A and Transaction B share an agent runtime, Transaction A’s conversation might still be in context when Transaction B begins—even if they’re the same user’s transactions. The kernel controls tool access, but the LLM’s “working memory” leaks.

The solution: separate reasoning from memory, with transaction-bounded persistence.

| Concern | Current model | Kernel model |
| --- | --- | --- |
| Reasoning (LLM thinking) | Persistent across transactions | Ephemeral per-transaction |
| Memory (agent knowledge) | Implicit in context window | Explicit, kernel-mediated |
| Writes | Immediate, unverified | Transaction-end, verified |

The kernel creates a fresh conversation buffer for each transaction. When the transaction ends, the buffer is destroyed. Proposed memory writes are verified against scope constraints before persisting.

Transaction lifecycle:
1. Context handle created (ctx_txn_123)
2. Designated memory loaded into working context
3. Agent reasons, proposes tool calls
4. Tool calls verified against capabilities
5. Agent proposes memory writes
6. Writes verified against scope, persisted if valid
7. Conversation buffer destroyed

This prevents accumulation attacks: even if a poisoned prompt triggers a malicious memory write in step 5, the kernel can reject it in step 6. And if it slips through, it’s scoped to that transaction’s partition—not the user’s entire memory.

Multi-turn conversation within a transaction is fine (all turns share the same context, same capabilities). But when the transaction ends, the working context is destroyed. The next transaction starts with a clean slate unless explicitly granted access to prior context.
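
Compressed into a sketch, the lifecycle looks like this. `agent_loop` and the kernel methods are stand-ins, not a real API; the point is the shape — writes are buffered and verified, and the buffer is destroyed unconditionally:

```python
def run_transaction(kernel, request):
    ctx = kernel.create_context(request)         # 1. fresh context handle
    buffer = kernel.load_designated_memory(ctx)  # 2. authorized data only
    proposed_writes = []
    try:
        for step in agent_loop(buffer):          # 3. agent reasons
            if step.kind == "tool_call":         # 4. checked against capabilities
                step.result = kernel.broker(ctx).tool_call(step.handle, step.params)
            elif step.kind == "memory_write":
                proposed_writes.append(step)     # 5. buffered, not yet persisted
        for w in proposed_writes:                # 6. verified against scope
            if kernel.verify_write_scope(ctx, w.key):
                kernel.persist(ctx.memory_partition, w.key, w.value)
    finally:
        buffer.destroy()                         # 7. destroyed, success or failure
```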

What about persistent memory? Agents need to remember things across transactions. But memory becomes explicit:

  • Memory injection at transaction start: Load authorized data into working context
  • Memory access via tool calls: Dynamic lookups during execution, capability-checked
  • Memory writes at transaction end: Persist learnings, scoped to the transaction’s capability, verified before commit

The kernel doesn’t prevent an agent from writing garbage to its own memory partition—that’s a userspace problem. But it guarantees that garbage stays in the designated partition. Other transactions—even from the same user—never see it unless explicitly granted access.

This is the segfault analogy: the kernel doesn’t prevent a process from corrupting its own memory. It prevents that corruption from affecting other processes. Same principle, applied to agent context.

Per-Tenant vs Per-Transaction: The Architectural Difference

Most security guidance stops at tenant isolation:

| Approach | What It Solves | What It Misses |
| --- | --- | --- |
| Separate indexes per tenant | Customer A can't see Customer B's data | User's sensitive Session 1 bleeds into their Session 2 |
| Metadata filters on retrieval | Queries scoped to current tenant | Semantic similarity ignores session boundaries |
| Row-level security | DB queries respect tenant boundaries | RAG retrieval isn't a DB query |

Per-transaction scoping goes further:

| Invariant | Effect |
| --- | --- |
| Fresh context handle per transaction | Each invocation is independent |
| Memory partition per transaction | Cross-session leakage is unaddressable |
| Writes verified at transaction end | Poisoning can't accumulate |
| Capabilities bound to context_id | Authority travels with the transaction |

The OS analogy: per-tenant is like per-user isolation (UIDs). Per-transaction is like per-process isolation (separate address spaces). Both matter. But per-process is the one that makes “bad” structurally inexpressible.

When you ask “can Agent B access this memory?”—the answer shouldn’t be “Agent B belongs to the same tenant.” It should be “Agent B has a capability that designates this memory for context ctx_req_456.”
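
As a sketch, that designation check is a prefix test against the context's own partition — tenant membership never enters into it (`TransactionContext` is the illustrative type from earlier; `store` stands in for any key-value backend):

```python
def memory_read(ctx: TransactionContext, store: dict, key: str):
    # The only question: does the key resolve inside this context's partition?
    # There is no tenant check and no "same user" check — designation or nothing.
    prefix = ctx.memory_partition.rstrip("*")
    if not key.startswith(prefix):
        raise KeyError(f"{key} is unaddressable in {ctx.context_id}")
    return store[key]
```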

The Market Is Saying the Same Thing

At KubeCon NA 2025, Google announced Agent Sandbox—a Kubernetes primitive built specifically for agent isolation:

Providing kernel-level isolation for agents that execute code and commands is non-negotiable.

The key design choice: isolation per transaction, not per user or per session:

Agentic code execution and computer use require an isolated sandbox to be provisioned for each task.

Google’s “task” maps to our “transaction”—both represent the per-invocation isolation boundary. Agent Sandbox provides the infrastructure layer (process isolation, network restrictions). The authorization layer that binds capabilities and memory to that boundary is what we’re describing.

Built on gVisor with Kata Containers support, Agent Sandbox provisions a fresh execution environment for every invocation. This is the OS model: per-process isolation, applied to agents.

Gartner predicts that over 50% of enterprises will use AI security platforms by 2028—platforms that “protect against AI-specific risks such as prompt injection, data leakage, and rogue agent actions.”

The demand is real:

  • 24.9% of large enterprises cite security as their second-largest blocker to agent deployment (LangChain, Dec 2025)
  • 80% have AI agents that have taken unintended actions (SailPoint, May 2025)
  • 56% of enterprise IT leaders cite security as their top agentic AI concern (UiPath, 2025)

The question isn’t whether agents need kernels. It’s who builds the authorization layer that binds capabilities to transactions—not just tenants.

Defense in Depth

Here’s where the architecture diverges from “just another sandbox.”

The key is separating trust boundaries. “Kernel” is doing a lot of work in this post—it actually refers to multiple layers with different trust assumptions:

| Layer | What It Does | Trust Assumption |
| --- | --- | --- |
| Host boundary | OS-enforced isolation per transaction (process/microVM/container) | Assumed secure (OS/container) |
| Syscall broker | Mediates all I/O, enforces sandbox (the “kernel runtime” from earlier) | May have bugs |
| Crypto core | Signatures, attenuation, revocation, counters | Formally verifiable target |
| Userspace | Framework + LLM client | Untrusted |

The syscall broker can be either (a) instantiated per-transaction (simplest) or (b) a shared host service. If shared, all broker replies must be context-bound and carry verifiable scope proofs—so a routing bug can’t silently cross contexts.
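
A sketch of what a context-bound scope proof could look like. HMAC with a shared key keeps the example short; a real crypto core would more likely sign asymmetrically. The fields mirror the `proof_of_scope` binding described under the syscalls:

```python
import hashlib
import hmac
import json

def sign_scope_proof(core_key: bytes, context_id: str, namespace: str,
                     chunk_hash: str, policy_id: str) -> str:
    """Issued by the crypto core alongside every retrieval result."""
    claim = json.dumps([context_id, namespace, chunk_hash, policy_id])
    return hmac.new(core_key, claim.encode(), hashlib.sha256).hexdigest()

def accept_reply(core_key: bytes, context_id: str, reply) -> bool:
    """Re-checked at the receiving context: a reply misrouted by a shared
    broker carries the wrong context_id and fails this comparison."""
    expected = sign_scope_proof(core_key, context_id, reply.namespace,
                                reply.chunk_hash, reply.policy_id)
    return hmac.compare_digest(reply.proof_of_scope, expected)
```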

Structural isolation (“Tenant B’s data doesn’t exist in Tenant A’s addressable memory”) is enforced by the host boundary—each transaction runs inside its own OS-enforced isolation boundary with its own memory space. The syscall broker can have bugs, but it can’t violate isolation because that’s the OS’s job.

Apple’s security architecture illustrates why even more separation helps. Apple’s Platform Security Guide describes SPTM (Secure Page Table Monitor), which “provides a smaller attack surface that doesn’t rely on trust of the kernel” and “protects page table integrity even when attackers have kernel write capabilities.”

Agent runtimes need the same pattern. The crypto core lives in a separate, minimal component—small enough to target formal verification (a few thousand lines). It validates capability signatures, enforces delegation chains, tracks use counts, and checks revocation.

What the crypto core must guarantee (even if the syscall broker is compromised):

  • Capability validity (signature verification)
  • Delegation chain integrity (hash chains + signatures)
  • Monotonic attenuation (each delegation can only restrict, never expand)
  • Use count limits (authoritative state)
  • Revocation enforcement
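
Of these, monotonic attenuation is the most mechanical to check: every link in a delegation chain must be a subset of its parent. A sketch, with a deliberately simplified constraint shape (an action set and a use budget):

```python
def attenuates(parent: dict, child: dict) -> bool:
    """True iff the child only restricts the parent: fewer actions, smaller budget."""
    return (set(child["actions"]) <= set(parent["actions"])
            and child["max_uses"] <= parent["max_uses"])

def verify_chain(chain: list[dict]) -> bool:
    # Walked root-to-leaf by the crypto core (signature checks omitted here):
    # a single expansion anywhere invalidates the entire chain.
    return all(attenuates(p, c) for p, c in zip(chain, chain[1:]))

# verify_chain([{"actions": ["read", "write"], "max_uses": 10},
#               {"actions": ["read"], "max_uses": 3}])  # True: strictly narrower
```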

What a compromised syscall broker can still do:

  • Misroute data it legitimately received (use a valid capability for the wrong transaction)
  • Fail to enforce sandbox rules (allow unmediated I/O)

What a compromised syscall broker CANNOT do:

  • Mint new capabilities
  • Exceed use counts
  • Forge delegation chains
  • Use revoked capabilities
  • Violate attenuation constraints
  • Read another process’s memory (host boundary)

What lives in the syscall broker (non-core): routing, transport, I/O multiplexing, sandbox policy enforcement, and tool invocation wiring.

You audit a few thousand lines of crypto core. You don’t have to fully trust 50,000 lines of syscall broker. Even if the broker has bugs, the cryptographic invariants hold—and OS-enforced isolation holds regardless.

Structural Isolation vs. Behavioral Policy

The difference matters:

Behavioral policy: “The agent shouldn’t access Tenant B’s data when processing Tenant A’s transaction.”

Structural isolation: “When processing Tenant A’s transaction, Tenant B’s data doesn’t exist in the agent’s addressable memory.”

Behavioral policy can fail—through bugs, prompt injection, or confused deputy errors. Structural isolation can’t be bypassed because there’s nothing to bypass. The data isn’t there to leak.

This is why operating systems don’t rely on processes being well-behaved. The MMU doesn’t evaluate whether a memory access is “appropriate.” It checks whether the page is mapped. If it’s not mapped, the access faults. No policy evaluation. No behavioral hope.
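
As a toy illustration of how structural that check is — the translation either resolves or faults, and there is no third branch where policy gets consulted:

```python
PAGE_SIZE = 4096

def translate(page_table: dict[int, int], vaddr: int) -> int:
    """Roughly what the MMU does: mapped or fault, nothing in between."""
    page, offset = divmod(vaddr, PAGE_SIZE)
    if page not in page_table:
        raise MemoryError("segfault")  # no "was this access appropriate?" branch
    return page_table[page] * PAGE_SIZE + offset
```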

The Kernel Advantage

A kernel-based architecture inverts the trust model:

| Current model | Kernel model |
| --- | --- |
| Agent has all credentials | Agent has opaque tool handles |
| Agent searches all memory | Agent queries its partition only |
| Runtime trusts agent’s choices | Runtime binds choices to context |
| Guardrails hope for compliance | Kernel enforces by construction |
| Bug in runtime = full compromise | Bug in runtime ≠ capability forgery |

The kernel doesn’t ask the agent to be good. It makes “bad” impossible to express.

What This Enables

The kernel now controls three things:

  • What the agent can see — context assembly from capability-scoped memory
  • What the agent can do — tool invocation through mediated syscalls
  • What the agent can remember — writes to partitioned, capability-gated storage

When isolation is structural and cryptographic invariants are separated:

  • Multi-tenant safety. Tenant A’s data can’t leak to Tenant B because it’s not in Tenant B’s addressable memory or context.
  • Cross-transaction isolation. Alice’s conversation is destroyed before Bob’s transaction begins. If a transaction gets manipulated, the damage is scoped to that transaction’s capability. The kernel doesn’t prevent userspace problems—it prevents them from spreading.
  • Autonomous deployment. You’re not trusting the LLM’s judgment—you’re constraining its authority. Prompt injection controls intent, not capability. An attacker can still cause in-scope damage (burn use counts, write garbage to the tenant’s own partition, trigger harmful-but-authorized actions)—but the kernel limits blast radius, even if it can’t guarantee correct behavior.
  • In-scope harm is still possible. The kernel prevents cross-scope exfiltration and cross-context confused-deputy attacks; it does not prevent an agent from doing bad things it is genuinely authorized to do. That’s a deliberate boundary of the design, not a bug.
  • Small TCB. The cryptographic core is a few thousand lines. You verify that, not the entire stack.
  • Provable authority chains. Every capability is cryptographically signed. Audit isn’t “what did the agent do?”—it’s “by what authority, delegated by whom, with what constraints?”
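
To ground that last point, a sketch of the shape such an audit record might take — the fields are illustrative, not a spec:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AuditRecord:
    """What log(event) persists: authority, not just activity."""
    context_id: str                    # which transaction acted
    tool_handle: str                   # what was invoked
    capability_chain: tuple[str, ...]  # signed delegations, root to leaf
    constraints: dict                  # limits in force at invocation time
    effect_hash: str                   # content-addressed record of the result
```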

This is the difference between “the agent might leak data” and “the agent can’t name the data to leak it.”

The Path Forward

Operating systems didn’t get secure by writing better processes. They got secure by building kernels that don’t require processes to be well-behaved.

Agent security will follow the same path. The question isn’t how to make agents behave better. It’s how to build runtimes that enforce isolation structurally, with cryptographic guarantees that hold even when the runtime doesn’t.

seL4 proved you could formally verify an OS kernel. The same approach applies to agent runtimes: a small, verifiable core that enforces the properties that matter, with defense in depth so that bugs in the larger system don’t compromise cryptographic invariants.

Agents don’t need guardrails. They need kernels.


This is part of a series on agent architecture. See also: Your Agent’s Memory Is a Security Hole · The Confused Deputy Problem, Explained · Proof of Continuity
