The Missing Layer: Authorization After Identity
Identity providers solve “who is this agent?” Orchestration platforms solve “what should this agent do?” But what solves “what can this agent actually do right now, in this transaction?”
Okta just published a seven-part series on AI agent security that’s worth your attention. Not because it solves the problem—because it maps the problem precisely.
Their thesis: human-centric IAM fails at machine speed. When agents execute 5,000 operations per minute, consent-based models collapse. Traditional identity infrastructure assumes sessions with a start and end, credentials issued once and managed later. That model breaks when applied to autonomous systems that run continuously for weeks.
Okta’s response is thoughtful: lifecycle-aware authorization, contextual access evaluation, fine-grained permissions. They’re extending identity infrastructure into the agent domain.
But identity infrastructure can only extend so far.
The Three-Layer Problem
Every enterprise deploying AI agents faces what we call the Triple Dilemma:
Challenge 1: Identity — Who is this agent?
Authentication and provenance. This is where Okta, Auth0, Keycard, and SPIFFE operate. Mature solutions exist. Agents can be registered, authenticated, and tracked as non-human identities with their own credentials.
Challenge 2: Orchestration — What should this agent do?
Workflow management and task routing. This is where OpenAI’s Agents SDK, LangChain, CrewAI, and similar platforms operate. Also mature. Agents can be composed into multi-agent systems with clear orchestration patterns.
Challenge 3: Capability — What can this agent actually do right now?
Fine-grained authorization that survives delegation, attenuates at each hop, and enforces constraints cryptographically. This layer is not standardized for multi-agent systems. Pieces exist—Macaroons in production, SPIFFE chaining, internal ledger systems—but no interoperable infrastructure that agents can rely on across organizational boundaries.
The gap is visible in canonical resources. Agentic Design Patterns, a 424-page guide covering 21 patterns for building AI agents, includes patterns for prompt chaining, routing, parallelization, tool use, multi-agent orchestration, MCP integration, and guardrails—but no pattern for delegation authorization. Its A2A (agent-to-agent) security section relies on OAuth 2.0 tokens and mutual TLS: mechanisms that authenticate entities, not transactions.
Identity providers see the gap too. Okta’s series explicitly acknowledges it. From their delegation chain article:
“By the third delegation hop, there is no cryptographic link to the initiating agent or user. Without cryptographic proof, malicious agents can forge delegation claims and access resources they shouldn’t reach.”
And later:
“OAuth tokens validate structure and status but lack historical traceability.”
We read this as acknowledgment of a structural gap—though Okta’s path forward emphasizes policy, telemetry, and governance rather than cryptographic chains. Our interpretation: the problem they’re describing requires transaction-level provenance, not just better identity management.
This is the crux. You can authenticate an agent (identity layer). You can orchestrate its workflow (orchestration layer). But you cannot prove that the agent invoking an API is the authorized continuation of a legitimate workflow with appropriately scoped permissions.
Why Identity Alone Can’t Solve This
Okta’s solutions—Token Vault, Fine-Grained Authorization, Identity Governance—are genuinely useful. They address lifecycle management, short-lived credentials, and policy-based access control. Modern IAM systems evaluate contextual signals, dynamic conditions, and delegated permissions in sophisticated ways.
But they operate within the identity paradigm: verifying who an entity is and what policies apply to it. That paradigm answers “what can this agent do?” It doesn’t answer “is this the authorized continuation of that specific transaction?”
Multi-hop agent authorization requires a different paradigm: verifying that this specific request is the authorized continuation of that specific transaction, with constraints that accumulated at each delegation hop.
Consider a refund workflow:
- User authorizes a refund agent
- Refund agent spawns a payment sub-agent
- Payment sub-agent invokes Stripe API
At step 3, traditional identity asks: “Is this a valid payment agent with appropriate credentials?”
That’s the wrong question.
The right question: “Is this request the legitimate continuation of a transaction that started with user authorization, passed through a refund agent that added a $500 limit constraint, and now reaches the payment API through the designated execution path?”
That question requires provenance—a cryptographic chain linking each hop. Identity providers verify the entity. Capability infrastructure verifies the transaction.
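The attenuation rule at the heart of that chain can be sketched in a few lines. This is a toy model, not a real protocol: the dictionaries, the `max_amount` key, and the `attenuate` helper are all illustrative, and a production chain would bind each hop with an Ed25519 signature rather than passing plain data.

```python
def attenuate(parent_constraints, child_constraints):
    """Merge constraints so a delegation hop can only narrow, never widen."""
    merged = dict(parent_constraints)
    for key, value in child_constraints.items():
        if key == "max_amount":
            # Numeric limits may only decrease down the chain.
            merged[key] = min(merged.get(key, value), value)
        elif key in merged and merged[key] != value:
            # A hop may not rewrite a constraint it inherited.
            raise ValueError(f"hop attempts to widen {key}")
        else:
            merged[key] = value
    return merged

# User -> refund agent -> payment sub-agent
hop1 = attenuate({}, {"action": "refund", "max_amount": 500})
hop2 = attenuate(hop1, {"max_amount": 200})      # sub-agent gets less
assert hop2["max_amount"] == 200

# A hop that tries to raise the limit still ends up capped.
hop3 = attenuate(hop2, {"max_amount": 10_000})
assert hop3["max_amount"] == 200
```

The point of the sketch: narrowing is a structural property of the merge, not a policy an agent could forget to apply.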
The Confused Deputy at Scale
The underlying problem has a name in security research: the confused deputy. A deputy (agent) receives authority detached from the context of why it was granted. The deputy acts on that authority without understanding whether the action aligns with original intent.
In traditional software, confused deputies cause bugs. In multi-agent systems—where delegation is frequent, async, and crosses trust boundaries—they cause breaches at scale.
Okta’s series documents specific attack patterns:
- Agent Session Smuggling (Unit 42): Attackers hijack agent sessions mid-workflow
- Cross-Agent Privilege Escalation (Johann Rehberger): Agents manipulated to exceed intended permissions
- EchoLeak (CVE-2025-32711): Tool-use agents exploited through context manipulation
All exploit the same gap: permissions that don’t narrow at each hop.
The Salesloft Drift breach is instructive. OAuth tokens valid for months were compromised and used to access 700+ organizations. The tokens were legitimate—properly issued, correctly structured, authorized by the identity provider. They just shouldn’t have still existed. This wasn’t an architectural impossibility—rotation and revocation mechanisms exist. But identity systems permit drift; they don’t structurally prevent it.
Authorization drift describes credentials persisting beyond their intended scope. The average credential stays active 47 days after it’s no longer needed. In that window, attackers don’t need sophisticated exploits. They wait.
The Theft Diagram
Here’s the fundamental difference between possession-based and continuity-based authorization.
Bearer Token (OAuth/JWT) — “Do you have it?”
In the basic bearer token model, an attacker who intercepts the token in transit gains full access. The Trust Plane validates the token’s structure and checks whether the bearer has permission—both checks pass because the attacker possesses a valid token. The action executes successfully.
(Yes, sender-constrained tokens like DPoP and mTLS exist. They bind tokens to specific clients—but they don’t bind tokens to specific transactions or enforce attenuation across delegation hops. The problem isn’t token theft alone; it’s authority that doesn’t narrow.)
```mermaid
sequenceDiagram
    participant A as Agent A
    participant X as Attacker
    participant B as Agent B
    participant TP as Trust Plane
    A->>B: [token]
    Note over X: Intercepts token
    X->>TP: [token]
    TP-->>TP: Token valid? ✓<br/>Bearer present? ✓
    TP-->>X: ACTION EXECUTED ⚠️
```

Capability Chain — “Are you the continuation?”
In the capability model, interception is useless. The attacker intercepts the chain and presents it to the Trust Plane. The Trust Plane validates the chain, identifies Agent B as the designated executor, and demands cryptographic proof. When the attacker cannot produce Agent B’s signature, the request is rejected—even though the chain itself is valid.
```mermaid
sequenceDiagram
    participant A as Agent A
    participant X as Attacker
    participant B as Agent B
    participant TP as Trust Plane
    A->>B: [chain]
    Note over X: Intercepts chain
    X->>TP: [chain] + bad signature
    TP-->>TP: Chain valid? ✓<br/>Designated: B<br/>Signed by B? ✗
    TP--xX: REJECTED ✓
```

The attacker cannot sign with Agent B’s private key. There’s nothing to steal—the chain doesn’t contain authority you can possess. It establishes who continues the transaction.
(This assumes the attacker cannot compromise Agent B itself. If an attacker can induce the legitimate agent to sign malicious requests—via prompt injection or session smuggling—then tight constraints become the last line of defense.)
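The continuity check in the second diagram can be demonstrated without any infrastructure. In this sketch, HMAC stands in for Ed25519 signatures purely to keep the example dependency-free; the key names and the `trust_plane_verify` function are illustrative. The property being shown is the one that matters: possessing the chain is not enough, you must also hold the designated executor’s signing key.

```python
import hashlib
import hmac

AGENT_B_KEY = b"agent-b-secret"    # hypothetical key material
ATTACKER_KEY = b"attacker-secret"  # attacker's own key, not B's

def sign(key, chain):
    """Stand-in signature; a real system would use Ed25519."""
    return hmac.new(key, chain, hashlib.sha256).hexdigest()

def trust_plane_verify(chain, signature, designated_executor_key):
    # Step 1: chain validity would be checked here (omitted).
    # Step 2: the requester must prove it IS the designated executor.
    expected = sign(designated_executor_key, chain)
    return hmac.compare_digest(expected, signature)

chain = b"user->refund_agent->payment_agent|max_amount=500"

# Legitimate continuation: Agent B signs, verification passes.
assert trust_plane_verify(chain, sign(AGENT_B_KEY, chain), AGENT_B_KEY)

# Interception: the attacker holds the full chain but not B's key.
assert not trust_plane_verify(chain, sign(ATTACKER_KEY, chain), AGENT_B_KEY)
```

Note that the verifier never asks “is this chain valid?” in isolation; it asks “is this chain valid *and* is the requester the party the chain designates?”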
Toward Capability-Based Authorization
Okta mentions emerging token formats—Macaroons, Biscuits, Wafers—that reflect the architecture delegation chains demand. Each token is “baked” with core attributes: identity, expiry, and a cryptographic root.
This is the right direction. But token formats alone don’t solve the problem. You need infrastructure that:
Enforces attenuation structurally: Each delegation hop must narrow permissions. Not by policy (which can be bypassed), but by cryptographic structure (which cannot).
Proves continuity, not possession: Traditional security asks “do you have a valid token?” Capability security asks “are you the designated executor in this transaction chain?”
Binds constraints to execution context: When an agent invokes an API, constraints like amount <= 500 should be verified at the Trust Plane against the cryptographic chain—not against an if-statement in the agent’s code.
Survives async boundaries: Agents communicate through message queues, event streams, and cross-organizational APIs. Authorization must travel with the transaction without exposing replay-able credentials.
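To make the third requirement concrete, here is a minimal sketch of constraint enforcement at the Trust Plane: the request’s parameters are checked against the chain’s accumulated constraints, so `amount <= 500` is enforced by infrastructure rather than by an if-statement inside the agent. The `enforce` function and constraint keys are hypothetical.

```python
def enforce(chain_constraints, request):
    """Trust Plane check: request parameters must satisfy the chain."""
    limit = chain_constraints.get("max_amount")
    if limit is not None and request.get("amount", 0) > limit:
        return False, "amount exceeds chain limit"
    action = chain_constraints.get("action")
    if action is not None and request.get("action") != action:
        return False, "action not authorized by chain"
    return True, "ok"

constraints = {"action": "refund", "max_amount": 500}

assert enforce(constraints, {"action": "refund", "amount": 300})[0]
assert not enforce(constraints, {"action": "refund", "amount": 900})[0]
assert not enforce(constraints, {"action": "payout", "amount": 100})[0]
```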
The Superset, Not the Alternative
A common misunderstanding: capability-based authorization competes with RBAC and ABAC. It doesn’t. It contains them.
RBAC expressed as capability:
Traditional RBAC:
"Users with role:finance can access resource:payments"
Capability equivalent:
grant(payments, constraints={role: "finance"})

The role is a constraint. The capability carries it.
ABAC expressed as capability:
Traditional ABAC:
"If user.department == 'finance' AND time.hour in 9-17 AND amount < 10000, allow"
Capability equivalent:
grant(payments, constraints={
department: "finance",
time_window: "09:00-17:00",
max_amount: 10000
})

The attribute checks are constraints. The capability carries them.
Every RBAC policy, every ABAC rule, every conditional access evaluation can be expressed as a constraint on a capability token. The mapping is mechanical. Nothing is lost.
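The mechanical nature of that mapping can be shown directly. This sketch renders the ABAC rule above as constraints carried by a `grant`; the function names and context shape are illustrative, not a real policy engine.

```python
from datetime import time

def grant(resource, constraints):
    """Mint a capability for a resource, carrying its constraints."""
    return {"resource": resource, "constraints": constraints}

# The ABAC rule "department == finance AND 9-17 AND amount < 10000"
# expressed as constraints on a capability:
cap = grant("payments", {
    "department": "finance",
    "time_window": (time(9), time(17)),
    "max_amount": 10_000,
})

def check(cap, ctx):
    c = cap["constraints"]
    start, end = c["time_window"]
    return (ctx["department"] == c["department"]
            and start <= ctx["now"] < end
            and ctx["amount"] < c["max_amount"])

assert check(cap, {"department": "finance", "now": time(10), "amount": 500})
assert not check(cap, {"department": "sales", "now": time(10), "amount": 500})
assert not check(cap, {"department": "finance", "now": time(22), "amount": 500})
```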
What’s gained is extensibility.
RBAC and ABAC evaluate policies at a central decision point. When Agent A delegates to Agent B, the policy server checks Agent B’s roles and attributes. This works for single-hop authorization.
But when Agent B delegates to Agent C? The policy server checks Agent C. It has no cryptographic proof that Agent C’s authority descends from Agent A’s original grant. It has no structural guarantee that permissions narrowed at each hop. It relies on policy configuration to prevent escalation—and policy can be misconfigured.
Capability chains don’t replace RBAC/ABAC evaluation. They extend it across delegation boundaries:
| Scenario | RBAC/ABAC | Capability Chains |
|---|---|---|
| Single agent, single resource | ✓ Policy evaluation | ✓ Constraint verification |
| Single agent, multiple resources | ✓ Role/attribute matching | ✓ Scoped constraints |
| Agent-to-agent delegation | ⚠️ New policy lookup, no provenance | ✓ Chain extends, constraints accumulate |
| 3+ hop delegation | ✗ No cryptographic lineage | ✓ Full provenance preserved |
| Cross-org verification | ✗ Requires federated policy | ✓ Chain self-verifies |
The architectural choice isn’t “RBAC vs. capabilities.” It’s “RBAC alone vs. RBAC embedded in capabilities.”
(Capability chains can also reference external PDPs for dynamic policy evaluation. If policy updates are themselves constrained to only attenuate—tightening is instant, loosening falls back to the chain’s original policy—you preserve strict attenuation while gaining instant updates for the common case. The architecture is a spectrum, and the hybrid designs are better than they first appear.)
Start with one agent calling one API. Your RBAC policies become capability constraints. Nothing changes operationally.
Add agent-to-agent delegation. The capability chain extends. Constraints accumulate. No new architecture required.
Scale to cross-organizational workflows. The chain carries proof. Verification happens locally. No federated policy infrastructure needed.
This is why capability-based authorization isn’t a different approach—it’s the complete one. Solutions that stop at RBAC/ABAC work until they hit multi-hop delegation. Then they hit a wall. The architecture that starts with capabilities never hits that wall because delegation is the primitive, not an afterthought.
The Architecture That’s Coming
Today’s agents call pre-built APIs with JSON parameters. Tomorrow’s agents will write their own.
Research from Anthropic shows agents writing code (instead of JSON tool calls) achieve 98% token savings. Cloudflare shipped V8 isolates for sub-100ms ephemeral code execution. The industry is moving toward:
- Agent-defined microservices: Agent A writes a procurement function and delegates execution rights to Agent B
- Event-driven choreography: Agents communicate through queues, not direct calls
- Cross-organizational workflows: Company A’s agent invokes Company B’s agent
This is where orchestration breaks and choreography becomes necessary. And choreography requires capability infrastructure.
Consider what happens when agents can mint their own functions:
Agent A (authorized for refunds ≤ $500):
→ Writes a bounded refund function
→ Deploys it as ephemeral microservice
→ Delegates invocation rights to Agent B
→ Agent B can call the function but cannot exceed $500

Without capability enforcement, this pattern is unsafe. What stops Agent A from writing a function that exceeds its own authority? What stops Agent B from modifying the function? How do you audit dynamically generated code?
The answer comes from capability-based operating systems like seL4. In seL4, components can mint endpoint capabilities for other components, badge those capabilities with constraints, and delegate them down the chain. The kernel enforces that capabilities only attenuate—a component cannot grant more authority than it possesses.
The analogy isn’t perfect—distributed systems lack a single trusted kernel, key management is harder across organizational boundaries, and revocation requires coordination that seL4 doesn’t face. But the semantic model transfers: capabilities as unforgeable tokens of authority that can only narrow as they propagate.
The same architecture applies to agents:
- Capability minting: Agent A creates a capability token for its generated function
- Badging: The token carries constraints inherited from Agent A’s own authority
- Delegation: Agent B receives the badged capability, can invoke but not exceed
- Enforcement: The Trust Plane verifies the chain before execution
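The minting and badging steps above can be sketched as a toy in-process “kernel.” The `TrustPlane` class and its clamping rule are hypothetical, and a distributed version would replace the in-memory table with signed capability tokens, but the seL4-style invariant is the same: a minted badge can never exceed the minter’s own authority.

```python
class TrustPlane:
    """Toy kernel: holds badges, enforces attenuation on mint and invoke."""

    def __init__(self):
        self._caps = {}  # capability id -> badge (constraints)

    def mint(self, owner_limit, badge):
        # Kernel rule: a badge may not exceed the minter's authority,
        # so an over-grant is clamped rather than honored.
        capped = min(badge.get("max_amount", owner_limit), owner_limit)
        cap_id = len(self._caps)
        self._caps[cap_id] = {"max_amount": capped}
        return cap_id

    def invoke(self, cap_id, amount):
        # Enforcement happens here, not in the delegated function's code.
        return amount <= self._caps[cap_id]["max_amount"]

tp = TrustPlane()
# Agent A (authority $500) mints a capability for its generated
# function and tries to over-grant; the badge is clamped to 500.
cap = tp.mint(owner_limit=500, badge={"max_amount": 800})

assert tp.invoke(cap, 300)      # Agent B invokes within the badge
assert not tp.invoke(cap, 600)  # exceeding Agent A's authority fails
```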
This isn’t speculative. The primitives exist. What’s missing is the integration layer that applies seL4-style capability semantics to multi-agent systems—where agents are components, functions are endpoints, and the Trust Plane is the kernel.
The result: agents constrained by their initial capabilities, but free to build arbitrarily complex systems within those constraints. Novel emergent behavior becomes possible without novel security risks. Creativity flourishes inside the sandbox because the sandbox is mathematically guaranteed.
What Enterprises Need Now
Teams deploying multi-agent systems today face a choice:
Accept the gap: Deploy with identity-based security, acknowledge authorization drift, plan for breach recovery. This is how most organizations operate today.
Add policy middleware: Wrap tool calls in RBAC/ABAC enforcement, block dangerous actions at runtime. This works for single agents—but creates an architectural dead-end when you need agent-to-agent delegation. You’ll hit a wall and have to re-platform.
Over-engineer controls: Add human approval for every sensitive operation, accept the scalability cost. This is what OpenAI recommends.
Build on capability infrastructure: Start with the architecture that scales. Your RBAC policies become capability constraints. When you add agent-to-agent delegation, the same primitive extends. No re-platforming. No architectural dead ends.
Option 4 is what this series explores.
The primitives exist. Ed25519 signatures are fast and widely deployed. Capability-based security has decades of research and production precedent (seL4, AS/400). What’s missing is the integration layer—infrastructure that applies these primitives to multi-agent workflows.
Identity providers handle authentication well. Orchestration platforms handle execution well. The gap between them—cryptographic capability enforcement across delegation chains—is where Proof of Continuity fits.
This is the second post in a series on AI agent security. Next: Proof of Continuity—technical foundations for authorization that survives delegation. See also: 5 Ways Your AI Agents Will Get Hacked for specific attack patterns.