
Capabilities 101: The Security Primitive That Changes Everything

What capabilities are, how they differ from ACLs, and why they matter for AI agent security—but also why capabilities alone aren't enough.


Everyone building AI agents eventually hits the same wall: how do you give an agent access to do its job without giving it access to everything?

The answer, developed over roughly 60 years of operating systems research, is capabilities. Not the vague “AI capabilities” people debate on Twitter/X, but actual capability tokens: unforgeable references that combine what you can access with proof that you’re allowed to access it. In distributed systems, that unforgeability usually comes from cryptography.

This post explains what capabilities are, how they work, and why they’re essential for agent security. But it also explains why capabilities alone aren’t sufficient—and what’s missing.

The Problem Capabilities Solve

Traditional access control asks: “Who are you?”

You present credentials. The system looks you up in an access control list (ACL). If your identity has the right permissions, access is granted.

This works fine when humans make requests at human speed. It breaks down when:

  • Agents delegate to other agents. By the third hop, which identity should the ACL check?
  • Permissions need to narrow. Can Agent B have less access than Agent A gave it?
  • Decisions happen at machine speed. Can we really query a policy server for every action?

Capabilities flip the question: “What do you possess?”

Instead of checking identity against a list, you present a token that is your authorization. The token designates the resource and encodes your permissions. No identity lookup required.

What Is a Capability?

A capability is an unforgeable token that combines two things:

  1. Designation: Which resource does this refer to?
  2. Authority: What operations are permitted?

Alan Karp’s definition captures it precisely: “An unforgeable, transferable permission to use the resource it designates.”

The key insight is that designation and authority travel together. You can’t separate “I’m pointing at this file” from “I can write to it.” The reference is the permission.

Traditional (ACL):
  Subject → "I am Alice"
  System  → [lookup Alice in ACL for Resource X]
  System  → "Alice has write permission, access granted"

Capability:
  Subject → [presents capability token for Resource X with write permission]
  System  → [verifies token is valid and unforged]
  System  → "Token grants write permission, access granted"

The difference seems subtle but has profound security implications.
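
To make the two flows concrete, here is a minimal Python sketch. Every name in it (`ACL`, `SERVER_KEY`, `mint`, `check_acl`, `check_capability`) is hypothetical, and the HMAC-sealed tuple is just one simple way to get an unforgeable token, not a production format.

```python
import hashlib
import hmac

# All names here are illustrative assumptions, not part of any real system.
ACL = {"resource-x": {"alice": {"read", "write"}}}
SERVER_KEY = b"demo-key-known-only-to-the-resource-server"


def check_acl(identity, resource, op):
    """ACL model: look the caller's identity up in a per-resource list."""
    return op in ACL.get(resource, {}).get(identity, set())


def _seal(resource, rights):
    return hmac.new(SERVER_KEY, f"{resource}|{rights}".encode(), hashlib.sha256).hexdigest()


def mint(resource, rights):
    """Issue a token naming the resource and rights, sealed with an HMAC tag."""
    return (resource, rights, _seal(resource, rights))


def check_capability(token, resource, op):
    """Capability model: verify the token itself; no identity is consulted."""
    tok_resource, rights, tag = token
    return (
        hmac.compare_digest(tag, _seal(tok_resource, rights))
        and tok_resource == resource
        and op in rights.split(",")
    )


token = mint("resource-x", "read,write")
forged = ("resource-x", "read,write", "0" * 64)  # guessed tag, not issued by the server

print(check_acl("alice", "resource-x", "write"))       # identity lookup
print(check_capability(token, "resource-x", "write"))  # token verification
print(check_capability(forged, "resource-x", "write")) # rejected: bad tag
```

Note that `check_capability` never asks who is calling. Whoever holds a valid token holds the authority, which is exactly what makes delegation cheap and theft dangerous.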

Three Properties That Matter

Unforgeable. You cannot create a capability to something you don’t have access to. In memory-safe languages, object references satisfy this—you can’t conjure a pointer to an object you’ve never received. In distributed systems, cryptographic signatures provide unforgeability.

Transferable. Capabilities can be passed from one entity to another. Alice can give Bob her capability to read a file. Bob now has that access without any central authority needing to update an ACL.

Attenuatable. When transferring a capability, you can narrow its permissions but never expand them. Alice can give Bob read-only access derived from her read-write capability. Bob cannot escalate back to write access.
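
One way to get all three properties in a distributed setting is a chained HMAC, in the style of macaroons. The sketch below is a simplified illustration under our own assumptions: `ROOT_KEY`, the `ops=` caveat format, and the function names are made up for the example.

```python
import hashlib
import hmac

ROOT_KEY = b"demo-root-key-held-by-the-issuer"  # hypothetical issuer secret


def _mac(key, msg):
    return hmac.new(key, msg.encode(), hashlib.sha256).digest()


def mint(resource):
    """Issuer creates an unrestricted token for one resource."""
    return {"resource": resource, "caveats": [], "sig": _mac(ROOT_KEY, resource)}


def attenuate(token, caveat):
    """Any holder can add a restriction; the new signature chains off the old one."""
    return {
        "resource": token["resource"],
        "caveats": token["caveats"] + [caveat],
        "sig": _mac(token["sig"], caveat),
    }


def verify(token, op):
    """Issuer recomputes the chain, then requires every caveat to allow the op."""
    sig = _mac(ROOT_KEY, token["resource"])
    for caveat in token["caveats"]:
        sig = _mac(sig, caveat)
    if not hmac.compare_digest(sig, token["sig"]):
        return False
    return all(op in c.removeprefix("ops=").split(",") for c in token["caveats"])


alice = mint("file-42")                 # read-write: no caveats yet
bob = attenuate(alice, "ops=read")      # Alice hands Bob a narrower token
stripped = {**bob, "caveats": []}       # Bob tries to drop the restriction

print(verify(alice, "write"), verify(bob, "read"))
print(verify(bob, "write"), verify(stripped, "write"))
```

Bob cannot escalate because removing the caveat requires the signature that preceded it, and HMAC cannot be run backwards to recover that value from his own.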

ACLs vs Capabilities: A Comparison

| Aspect | ACL Model | Capability Model |
| --- | --- | --- |
| Core Question | “Who are you?” | “What do you possess?” |
| Where Permissions Live | Centralized list per resource | Distributed in tokens |
| Delegation | Requires admin to update ACL | Pass the token |
| Attenuation | Requires new ACL entry | Derive restricted token |
| Revocation | Update the list | Invalidate the token |
| Confused Deputy | Vulnerable | Reduced (but not eliminated) |

The confused deputy vulnerability is instructive. In ACL systems, a program might accidentally use its own elevated permissions to act on behalf of a malicious user. The program has permissions A and permissions B; nothing distinguishes which should apply to which transaction.

Capabilities reduce this surface because you must possess the specific capability to exercise it. But as we’ll see, possession alone doesn’t solve everything.

Classic Capabilities vs Object Capabilities (OCAP)

Here’s where terminology gets confusing. There are two distinct capability traditions, and conflating them causes problems.

Classic (Kernel-Level) Capability Systems

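In systems like Hydra (1970s), a capability is a token stored in a kernel-managed table called a C-List (Capability List). POSIX capabilities, despite the name, are a different mechanism: they split root's privileges into per-process flags and do not designate individual objects.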

Process A's C-List:
┌───────┬────────────────┬─────────┐
│ Index │ Object         │ Rights  │
├───────┼────────────────┼─────────┤
│ 0     │ File /data/x   │ R       │
│ 1     │ File /data/y   │ RW      │
│ 2     │ Device /dev/z  │ X       │
└───────┴────────────────┴─────────┘

To access a resource, the process provides an index to its C-List. The kernel looks up the capability, verifies the rights, and performs the operation.

Key characteristic: The capability (index + rights) is separate from the object reference. The kernel mediates every access.
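
A toy model of that mediation, mirroring the table above, might look like the following. The `Kernel` class and its methods are hypothetical and stand in for what a real kernel does in protected memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Capability:
    obj: str
    rights: frozenset


class Kernel:
    """Toy kernel: C-Lists live here and are never handed to processes."""

    def __init__(self):
        self._clists = {}  # process id -> list of Capability

    def grant(self, pid, obj, rights):
        clist = self._clists.setdefault(pid, [])
        clist.append(Capability(obj, frozenset(rights)))
        return len(clist) - 1  # the process only ever sees this index

    def invoke(self, pid, index, op):
        clist = self._clists.get(pid, [])
        if not 0 <= index < len(clist):
            raise PermissionError("no capability at that index")
        cap = clist[index]
        if op not in cap.rights:
            raise PermissionError(f"{op} not permitted on {cap.obj}")
        return f"{op} {cap.obj}"


kernel = Kernel()
kernel.grant("A", "/data/x", "R")    # index 0
kernel.grant("A", "/data/y", "RW")   # index 1
kernel.grant("A", "/dev/z", "X")     # index 2

print(kernel.invoke("A", 1, "W"))    # allowed: slot 1 carries W
```

The process never touches a `Capability` object directly. It names a slot, and the kernel decides.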

Object-Capability (OCAP) Systems

In OCAP languages like E or Pony, the object reference itself is the capability. If you hold a reference to an object, you can invoke its methods. There’s no separate “rights” check.

// OCAP style (pseudocode)
function transferMoney(fromAccount, toAccount, amount) {
  // If you have the reference, you have the capability
  fromAccount.debit(amount);
  toAccount.credit(amount);
}

Key characteristic: Authority is embedded in the reference. No separate lookup. “Only connectivity begets connectivity”—you can only reach objects you’ve been given references to.

The Critical Differences

| Feature | Classic Capability | Object Capability (OCAP) |
| --- | --- | --- |
| Authority Primitive | Index + Rights bits | Unforgeable object reference |
| Enforcement | Kernel checks a table | Object interface / language runtime |
| Delegation | Copy entry in C-List | Pass object reference in message |
| Attenuation | Modify rights bits | Create wrapper/proxy object |
| Ambient Authority | Often present | Forbidden by design |

The OCAP model is generally considered “purer” because it eliminates ambient authority entirely. In classic systems, a process might still access resources through paths outside its C-List (environment variables, global state). In strict OCAP, the only way to access anything is through references you’ve been explicitly given.
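
The “create wrapper/proxy object” row is easiest to see in code. The sketch below uses Python only to show the pattern; `Account` and `read_only` are hypothetical names, and Python itself is not a strict OCAP language (introspection can reach the wrapped object), so treat this as an illustration rather than an enforcement mechanism.

```python
class Account:
    def __init__(self, balance):
        self._balance = balance

    def balance(self):
        return self._balance

    def debit(self, amount):
        if amount > self._balance:
            raise ValueError("insufficient funds")
        self._balance -= amount

    def credit(self, amount):
        self._balance += amount


def read_only(account):
    """Attenuate by wrapping: the facade exposes balance() and nothing else."""

    class ReadOnlyAccount:
        def balance(self):
            return account.balance()

    return ReadOnlyAccount()


def transfer_money(from_account, to_account, amount):
    # Holding the references is the authority; there is no separate check.
    from_account.debit(amount)
    to_account.credit(amount)


checking, savings = Account(100), Account(0)
transfer_money(checking, savings, 30)

auditor_view = read_only(checking)   # hand this out instead of `checking`
print(auditor_view.balance())        # the auditor can read the balance
print(hasattr(auditor_view, "debit"))  # but has no debit method to call
```

Attenuation here needs no rights bits at all: the narrower capability is simply a different object with a smaller interface.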

seL4: Capabilities in Practice

seL4, the formally verified microkernel, implements capabilities at the OS level:

  • CNodes (Capability Nodes) are arrays that hold capabilities
  • CSlots are individual positions in a CNode
  • CSpace is a thread’s complete capability namespace

When a thread wants to use a capability, it provides the CSlot index. The kernel verifies the capability exists and the operation is permitted. Authority derives entirely from what capabilities a thread possesses in its CSpace.

Thread A's CSpace:
┌─────────────────────────────────────┐
│ CNode (root)                        │
├─────────────────────────────────────┤
│ Slot 0: TCB cap (Thread Control)    │
│ Slot 1: Endpoint cap (IPC)          │
│ Slot 2: Untyped cap (Memory)        │
│ Slot 3: CNode cap (to another CNode)│
└─────────────────────────────────────┘

seL4 proves a critical property: if a thread doesn’t hold a capability, it cannot access the corresponding resource. Period. The kernel formally guarantees this.
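
As a rough mental model, a CSpace is a tree of CNodes that the kernel walks to find a capability. The toy below is not seL4's API: real seL4 addresses slots with bit-packed capability pointers and guards, whereas this sketch assumes a plain list of slot indices, and `Cap`, `CNode`, and `resolve` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Cap:
    kind: str                      # e.g. "tcb", "endpoint", "untyped"
    rights: frozenset = frozenset()


@dataclass
class CNode:
    slots: dict = field(default_factory=dict)  # slot index -> Cap or CNode


def resolve(root, path):
    """Walk slot indices from the root CNode; fail if any step is missing."""
    current = root
    for index in path:
        if not isinstance(current, CNode) or index not in current.slots:
            raise LookupError(f"no capability at {path}")
        current = current.slots[index]
    return current


# Mirrors the diagram above: slot 3 holds a capability to another CNode.
inner = CNode({0: Cap("endpoint", frozenset({"send"}))})
root = CNode({
    0: Cap("tcb"),
    1: Cap("endpoint", frozenset({"send", "recv"})),
    2: Cap("untyped"),
    3: inner,
})

print(resolve(root, [1]).kind)
print(sorted(resolve(root, [3, 0]).rights))
```

If a path does not resolve, there is nothing to invoke. That is the property in miniature: no capability in the CSpace, no access.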

Why Capabilities Alone Aren’t Enough

Capabilities elegantly solve the “what can you access?” problem. They don’t solve the “which of your capabilities should you use for this transaction?” problem.

This is the confused deputy problem, and it persists even in capability systems.

Consider: An agent legitimately receives Capability A for Customer X’s records and Capability B (a superuser key) for a compliance investigation. Both are valid. The agent possesses both legitimately.

Now the agent processes a transaction from Customer X. Nothing in the capability model prevents the agent from using Capability B instead of Capability A.

flowchart TB
    subgraph Agent["Agent's Capability Space"]
        capA["Capability A<br/>(Customer X only)"]
        capB["Capability B<br/>(Superuser access)"]
    end

    subgraph Transaction["Customer X's Transaction"]
        req["'Get my records'"]
    end

    req --> Agent

    capA -.->|"Correct choice"| result1["Customer X records"]
    capB -.->|"Wrong choice<br/>(but valid capability)"| result2["Any customer's records"]

    style capA stroke:#090
    style capB stroke:#f90

The capability model guarantees:

  • You can only exercise authority you possess
  • You can only possess authority explicitly granted

It does not guarantee:

  • You will exercise the right authority for this transaction

Capabilities shrink the attack surface (the agent can only access what it has capabilities for), but within that space, confusion remains possible.
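
The scenario above fits in a few lines. This is a deliberately naive sketch with hypothetical names (`RECORDS`, `RecordsCap`, `handle_request`); the point is what the code does not check.

```python
RECORDS = {"customer-x": ["x-invoice"], "customer-y": ["y-invoice"]}


class RecordsCap:
    """A capability to read records for a fixed set of customers."""

    def __init__(self, allowed):
        self._allowed = frozenset(allowed)

    def read(self, customer):
        if customer not in self._allowed:
            raise PermissionError(f"capability does not cover {customer}")
        return RECORDS[customer]


cap_a = RecordsCap({"customer-x"})   # Capability A: Customer X only
cap_b = RecordsCap(RECORDS)          # Capability B: superuser, all customers


def handle_request(requested_customer, cap):
    # Nothing here ties the chosen capability to the transaction's origin.
    return cap.read(requested_customer)


# A request arriving in Customer X's session asks for Customer Y's data.
try:
    handle_request("customer-y", cap_a)
except PermissionError as err:
    print("with A:", err)

print("with B:", handle_request("customer-y", cap_b))
```

Both calls are valid uses of valid capabilities. Whether the leak happens depends entirely on which one the agent reaches for.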

What’s Missing: Provenance

The gap is provenance—knowing not just what capabilities exist, but why they exist and which should apply to a specific transaction.

When Customer X initiates a transaction, the system should bind:

  • This transaction
  • This specific capability
  • This execution context

That binding should travel with the transaction through every delegation hop. When the agent (or sub-agent, or sub-sub-agent) finally acts, the system verifies: “Are you using the capability designated for this transaction?”

This is what we call Proof of Continuity. Not just proof that you possess a capability, but proof that you’re the designated executor of a specific transaction, using the specific capability that was bound when the transaction began.

Capability alone:
  "Do you have a valid capability?" → Yes (but which one?)

Capability + Provenance:
  "Are you the designated continuation of Transaction T,
   operating with Capability C that was bound at origin?" → Verifiable
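
A stripped-down sketch of that extra check follows. The names (`BoundCapability`, `authorize`) are hypothetical, and the signatures that would make the binding unforgeable are left out to keep the idea visible.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundCapability:
    resource: str
    rights: frozenset
    transaction_id: str  # fixed when the transaction begins, never rewritten


def authorize(cap, transaction_id, resource, op):
    """Possession is not enough: the capability must belong to this transaction."""
    return (
        cap.transaction_id == transaction_id
        and cap.resource == resource
        and op in cap.rights
    )


cap_a = BoundCapability("records/customer-x", frozenset({"read"}), "txn-1")
cap_b = BoundCapability("records/*", frozenset({"read"}), "txn-investigation")

# Inside Customer X's transaction, only the capability bound at origin passes.
print(authorize(cap_a, "txn-1", "records/customer-x", "read"))
print(authorize(cap_b, "txn-1", "records/*", "read"))
```

The agent still holds both capabilities, but the choice between them is no longer the agent's to make.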

What Amla Labs Builds

We build on decades of capability research—Dennis and Van Horn (1966), seL4, Biscuit tokens—and extend it for AI agent workflows.

Our capability chains are append-only sequences of cryptographically signed blocks. Each block:

  1. Designates the resource and permissions (standard capability)
  2. Binds to a specific transaction context
  3. Designates the next authorized executor
  4. Can only attenuate (narrow), never expand

When an agent receives a capability chain, it’s not just “here’s what you can do.” It’s “here’s what you can do, for this specific transaction, and you must prove you’re the designated executor.”

A stolen token is useless on its own, because the attacker can’t sign as the designated executor. Confused deputy attacks are prevented structurally: the transaction binding, not the agent’s choice, determines which capability applies.
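
To show how the four block properties could fit together, here is a toy chain in Python. It is our illustration for this post, not Amla Labs' actual format: the block fields, `append_block`, `prove`, and `verify` are hypothetical, and a shared table of HMAC keys stands in for real asymmetric signatures.

```python
import hashlib
import hmac
import json

# Stand-in for public-key signatures (assumption): per-party HMAC keys.
# A real verifier would hold only public keys, never signing keys.
KEYS = {"issuer": b"k-issuer", "agent-a": b"k-agent-a", "agent-b": b"k-agent-b"}


def _sign(signer, payload, prev_sig):
    msg = json.dumps(payload, sort_keys=True) + prev_sig
    return hmac.new(KEYS[signer], msg.encode(), hashlib.sha256).hexdigest()


def append_block(chain, signer, resource, rights, txn, next_executor):
    """Return a new chain with one more signed block; the input is not mutated."""
    payload = {"resource": resource, "rights": sorted(rights), "txn": txn,
               "signer": signer, "next": next_executor}
    prev_sig = chain[-1]["sig"] if chain else ""
    return chain + [{**payload, "sig": _sign(signer, payload, prev_sig)}]


def prove(executor, txn, op):
    """The designated executor signs the concrete request with its own key."""
    return hmac.new(KEYS[executor], f"{txn}|{op}".encode(), hashlib.sha256).hexdigest()


def verify(chain, txn, op, presenter, proof):
    """Check signatures, hand-offs, attenuation, txn binding, and the proof."""
    prev = None
    for block in chain:
        payload = {k: v for k, v in block.items() if k != "sig"}
        expected_signer = prev["next"] if prev else "issuer"
        if block["signer"] != expected_signer or block["txn"] != txn:
            return False
        prev_sig = prev["sig"] if prev else ""
        if not hmac.compare_digest(block["sig"], _sign(block["signer"], payload, prev_sig)):
            return False
        if prev and (block["resource"] != prev["resource"]
                     or not set(block["rights"]) <= set(prev["rights"])):
            return False  # later blocks may only narrow, never expand
        prev = block
    if prev is None or prev["next"] != presenter or op not in prev["rights"]:
        return False
    return hmac.compare_digest(proof, prove(presenter, txn, op))


chain = append_block([], "issuer", "records/customer-x", {"read", "write"}, "txn-1", "agent-a")
chain = append_block(chain, "agent-a", "records/customer-x", {"read"}, "txn-1", "agent-b")

print(verify(chain, "txn-1", "read", "agent-b", prove("agent-b", "txn-1", "read")))
print(verify(chain, "txn-1", "read", "agent-b", "0" * 64))  # stolen chain, no key
```

In this sketch, someone who copies the chain but lacks agent-b's key cannot produce the final proof, and a block that tries to widen rights or switch transactions fails verification.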

Capabilities provide the foundation. Provenance provides the missing piece. Together, they enable authorization that actually works for autonomous systems.

