Amla Labs Memo
The Agentic Authorization Memo
Why AI agents need kernels, not guardrails—and how structural isolation with cryptographic capability chains provides it.
December 2025 · ~30 min read
Preface
AI agents are no longer chatbots. They're autonomous systems that take real actions: issuing refunds, processing payouts, adjusting database records, delegating to other agents. This shift from conversation to action creates a security problem that existing infrastructure wasn't designed to solve.
By the late 1960s, operating systems like Multics tied each process to its own virtual address space. Other processes' memory wasn't hidden—it was unaddressable. The hardware memory management unit made it impossible to even name another process's memory. Agents need the same treatment: not behavioral guardrails that hope for compliance, but structural isolation that makes "bad" impossible to express.
Identity providers (Okta, Auth0) answer "who is this agent?" Orchestration platforms (OpenAI's Agents SDK, LangChain) answer "what should this agent do?" But neither answers the critical question for autonomous systems: "Is this specific transaction the authorized continuation of a legitimate workflow, with constraints that accumulated at each delegation hop—and with context scoped to prevent cross-transaction leakage?"
This memo argues that the gap isn't just unaddressed—it's structural. Traditional authorization assumes human review cadence. Agents operate at machine speed—completing workflows in seconds that would take humans hours. Traditional tokens carry identity. Agent workflows require provenance. And critically: the real isolation boundary isn't the user or the tenant—it's the transaction.
Terms (So We Mean the Same Thing)
- Transaction: the unit of authorization and isolation. It starts when a user- or system-initiated action is admitted and ends when its effects commit.
- Context: the scoped working set bound to a transaction (capabilities, memory partitions, designated inputs).
- Session: the user interaction channel that may span multiple transactions.
- Workflow/Task: the logical business process that can include multiple transactions.
Amla builds the missing layer: an agent kernel that enforces isolation structurally, with cryptographic capability chains that prove continuity across multi-hop delegation, attenuate permissions monotonically, scope context to transactions, and provide signed receipts for every action.
The Problem
The Authorization Gap
OpenAI's 32-page enterprise guide "A Practical Guide to Building Agents" is thoughtful, well-structured, and genuinely useful. It contains zero security model for credentials.
Their "guardrails" section lists six protection categories:
| Guardrail Type | What It Does |
|---|---|
| Relevance classifier | Flags off-topic queries |
| Safety classifier | Detects jailbreaks |
| PII filter | Redacts personal information |
| Moderation | Flags harmful content |
| Tool safeguards | Risk ratings |
| Rules-based | Regex, blocklists |
"Tool safeguards" is the closest thing to authorization guidance, and the full recommendation is: assign risk ratings and escalate high-risk actions to humans.
This is policy-based risk assessment with human escalation as the safety net. Not cryptographic enforcement. Not capability attenuation. A risk rating.
Why Human Escalation Doesn't Scale
AI agents execute orders of magnitude faster than human review cycles—completing entire workflows in seconds that would take humans hours.
OpenAI's answer to high-risk operations is human-in-the-loop intervention: "High-risk actions should trigger human oversight until confidence in the agent's reliability grows."
At agent velocity, human oversight becomes impossible. Agents complete entire workflows—including violations—before anyone can act. "Human oversight until confidence grows" doesn't ship production agents. Mathematical guarantees do.
The Third-Hop Problem
Okta's seven-part series on AI agent security (link) maps the problem precisely. From their delegation chain analysis:
By the third delegation hop, there is no cryptographic link to the initiating agent or user. Without cryptographic proof, malicious agents can forge delegation claims and access resources they shouldn't reach.
And:
OAuth tokens validate structure and status but lack historical traceability.
OAuth doesn't prove provenance. By the third delegation hop, you know who is making a request. You don't know why they have the authority to make it, what constraints should apply, or whether this request is a legitimate continuation of the original workflow.
Two Confused Deputies
The confused deputy problem (Hardy, 1988) describes a program that holds authority its requestor lacks and can be tricked into exercising that authority on behalf of the wrong principal. What's less discussed: agents are confused deputies in two distinct ways.
| Layer | Confused Deputy Type | What's Misapplied | The Failure |
|---|---|---|---|
| Effects | Authority | Wrong principal's capability | Agent uses Alice's credentials while processing Bob's transaction |
| Inputs | Information | Wrong principal's data | Agent retrieves Alice's data while assembling Bob's context |
Current security guidance focuses on the authority confused deputy—credential management, tool permissions, human escalation. But the information confused deputy is equally dangerous: agents can retrieve semantically similar chunks from shared memory stores, leaking data across tenants or sessions without any credential misuse.
Both share the same structural shape: an untrusted executor sits at a confluence of streams. If it can choose which stream to consult, it can apply the wrong one. The fix is the same for both: bind capabilities AND context to transactions, not agents. The agent doesn't select which credential to use or which memory to search—the transaction designates both.
Real Incidents
Replit (July 2025): An AI agent erased 1,206 executive records from a live database in seconds. The agent had standing credentials and operated with unrestricted access to production. The credentials were valid—nothing enforced limits. (Authority confused deputy.)
Salesloft Drift (August 2025): Compromised OAuth tokens exposed 700+ organizations over 10 days. Tokens originated from a GitHub compromise months earlier but persisted because authorization wasn't lifecycle-aware. (Authority confused deputy.)
Cross-session memory leaks: Giskard formalizes "Cross Session Leak" as a distinct vulnerability class—sensitive info bleeding across sessions due to shared caches or poorly scoped context. (Information confused deputy.)
The pattern: credentials or context that exist without corresponding transaction-level scoping.
Why Now
The Agent Explosion
Nearly 50% of Y Combinator's S25 batch were AI agent companies. Deloitte reports 25% of enterprises using GenAI have deployed agents, with 50% expected by year-end 2025.
Research from Google, MIT, and DeepMind on multi-agent systems ("Towards a Science of Scaling Agent Systems") tested 180 configurations across 5 architectures and 3 LLM families, and found that independent, uncoordinated multi-agent systems amplify errors up to 17x compared to single-agent baselines.
Foundation Capital's “System of Agents” thesis (Oct 2024) frames this as a $4.6T shift from software-as-a-tool to service-as-software: agents sit where data is born, orchestrate work, and create their own traces. It reinforces that the execution path is the control point—and the asset—because that's where decision context can be captured (or enforced).
Foundation Capital's follow-up on agent reliability (Nov 2025) quantifies the “extra nines” problem: 99% per-step accuracy yields ~37% success over 100 steps; 99.9% yields ~90%. That's statistical reliability. We pair it with structural reliability: capability chains so the 0.1% failure case stays within cryptographic scope, not ambient authority.
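The compounding arithmetic is easy to verify yourself. A minimal sketch in plain TypeScript, assuming independent per-step failures:
function workflowSuccess(perStepAccuracy: number, steps: number): number {
  // Probability a workflow succeeds end-to-end when every step must succeed.
  return Math.pow(perStepAccuracy, steps);
}
console.log(workflowSuccess(0.99, 100));  // ~0.366: 99% per step yields ~37% over 100 steps
console.log(workflowSuccess(0.999, 100)); // ~0.905: 99.9% per step yields ~90% over 100 steps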
Their “99% step-length” framing maps directly to Proof of Continuity. Foundation measures outcome horizon length—how far an agent can go before a human intervenes. Amla enforces authorized horizon length—how far a transaction can propagate before it exhausts its scoped, signed capabilities. Every extra “nine” of cognition still needs the kernel/Trust Plane to keep the rare failure inside bounds and leave a signed receipt.
Multi-agent deployments are happening. Security infrastructure for them is not.
Multi-Model Reality (Commoditization + Specialization)
Enterprises already route across multiple models to balance cost and specialization. Surveys and vendor launches show multi-model is default, not future:
- OpenAI + Anthropic + Gemini in one stack: SAP Generative AI Hub markets “one API to multiple leading models” (OpenAI, Anthropic, Gemini, etc.) inside SAP BTP (SAP Generative AI Hub).
- Vendor-neutral routing: ServiceNow “Model Provider Flexibility” routes across providers via AI Control Tower (ServiceNow AI Control Tower).
- Salesforce partner models: Agentforce partners with OpenAI (GPT-5) and Anthropic Claude, governed by Einstein Trust Layer (Salesforce announcement).
- Cost/specialization reasons: Industry surveys (e.g., LangChain State of Agent Engineering) report 75%+ of production teams use multiple models to mix quality, latency, and price points.
Commoditization means model-provider identity is not the trust anchor. Enterprises need a consistent authorization/audit plane across model hops, because the model gateway key is not “Alice the support rep.”
Regulatory Pressure
The EU AI Act requires "human oversight" for high-risk AI systems—not per-action approval, but the ability to understand and audit autonomous decisions.
Financial services (Visa, Mastercard frameworks), healthcare (HIPAA audit requirements), and emerging regulations are converging on a common requirement: Know Your Agent (KYA). Verifiable identity, provable authorization chains, and complete audit trails for every autonomous action.
This isn't speculative compliance. It's current regulatory direction that enterprises will need to demonstrate.
The Architecture Shift
Today's agents call pre-built APIs with JSON parameters. Tomorrow's agents will write their own.
Research from Anthropic shows agents writing code (instead of JSON tool calls) achieve 98% token savings. Cloudflare shipped V8 isolates for sub-100ms ephemeral code execution. The industry is moving toward:
- Agent-defined microservices: Agent A writes a function and delegates execution rights to Agent B
- Event-driven choreography: Agents communicate through queues, not direct calls
- Cross-organizational workflows: Company A's agent invokes Company B's agent
In this world, authorization cannot be "check the policy server." Authorization must travel with the transaction, survive async boundaries, and prove provenance at execution time.
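Concretely, "travels with the transaction" means the authorization rides inside the message itself, so a consumer on the far side of a queue can verify provenance at execution time without a policy-server callback. A minimal sketch; the envelope shape and the verifyChain signature are illustrative, not a shipped API:
interface CapabilityBlock {
  designatedExecutor: string;           // public key of the next hop
  constraints: Record<string, unknown>; // may only narrow at each hop
  signature: string;                    // signed by the issuing authority
}
interface QueueMessage {
  payload: unknown;
  chain: CapabilityBlock[]; // full lineage from root authority to this hop
}
// Verification happens where the side effect happens, however long the
// message sat in the queue and however many hops produced it.
async function onMessage(
  msg: QueueMessage,
  verifyChain: (chain: CapabilityBlock[]) => Promise<boolean>,
): Promise<void> {
  if (!(await verifyChain(msg.chain))) {
    throw new Error("not an authorized continuation of the originating workflow");
  }
  // execute msg.payload under the head block's constraints
}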
Infrastructure Is Catching Up
At KubeCon NA 2025, Google announced Agent Sandbox—a Kubernetes primitive built specifically for agent isolation:
Providing kernel-level isolation for agents that execute code and commands is non-negotiable.
The key design choice: isolation per task, not per user or per session:
Agentic code execution and computer use require an isolated sandbox to be provisioned for each task.
Google's "task" maps to our "transaction"—both represent the per-invocation isolation boundary. Agent Sandbox provides the infrastructure layer (gVisor sandbox with optional Kata hardware isolation; OS-enforced boundary). The authorization layer that binds capabilities and memory to that boundary is what we're building.
The market validation is clear: enterprises need per-transaction isolation, and infrastructure providers are shipping primitives for it. What's missing is the cryptographic authorization layer that makes delegation chains verifiable.
Multi-Model Control Planes Are Already Shipping
Enterprises are already routing across multiple models behind a single governance layer. The pattern exists because identity/authorization/audit don't naturally propagate across model/provider hops.
- Salesforce Agentforce + Einstein Trust Layer: Agentforce runs partner models (OpenAI, Anthropic) and wraps them with the Einstein Trust Layer for data masking, audit, and governance in regulated industries.
- ServiceNow AI Control Tower: Model Provider Flexibility and centralized policies/routes across models via AI Control Tower to keep identity, permissions, and audit consistent.
- SAP Generative AI Hub: A single API to multiple leading models (OpenAI, Anthropic, Gemini, etc.) within SAP BTP with RBAC/authorizations for access, per SAP Generative AI Hub.
Why it hurts authorization: When you route OpenAI ↔ Anthropic (or internal ↔ external models), you have multiple inference hops producing instructions. Provider calls are made with a gateway key, so the provider sees "the gateway," not "Alice the support rep." You still need internal continuity: under whose authority did this tool call execute, and what constraints were in force across hops? That's the gap these control planes are trying to cover.
Concrete deployment shape (Salesforce refund/concession agent):
- User (support rep) triggers an agent from Salesforce UI/Slack/ChatGPT integration.
- Agentforce routes across models: Model A plans/reasons; Model B handles compliance/regulated workloads (Salesforce positions Claude for regulated sectors).
- System pulls CRM context (customer record, ticket, entitlement, policy).
- Agent proposes an action: issue refund/credit, update case, etc.
- Einstein Trust Layer/governance must ensure scoped data access, audit trail, and that actions execute under the right principal (rep permissions, not a shared superuser key).
Buyer reality: Platform owners (Salesforce/ServiceNow/SAP admins, RevOps/IT) are the economic buyers; Security/GRC is the veto-holder when money or sensitive data is touched.
Real Multi-Agent Delegation (AWS Bedrock AgentCore)
AWS publishes multi-agent patterns in Amazon Bedrock's Agent Framework/AgentCore, including supervisor → specialist delegation samples (e.g., SRE assistant, network ops assistant). Reference: Amazon Bedrock Agents documentation (docs).
Example: multi-agent SRE assistant (supervisor → specialists):
- User reports an incident ("API latency is up").
- Supervisor agent plans and delegates to specialist agents (Kubernetes, logs, metrics, runbooks).
- Each specialist calls tools via a gateway/identity layer (Bedrock Agents use IAM roles and API endpoints to avoid hardcoded credentials).
- Supervisor aggregates results and produces an action plan.
Why authorization hurts here:
- Principal ambiguity: is the caller the human SRE, supervisor, or each sub-agent?
- Scope per sub-agent: the logs agent shouldn't have "restart deployment" authority.
- Continuity proof: how to show a tool call was the authorized continuation of this incident, not a replay from another thread.
AWS's inclusion of identity + gateway in the reference pattern is itself a signal: delegation across agents forces a continuity/auth problem that needs infrastructure. Economic buyer: Platform Engineering / SRE tooling; veto buyer: Security/IAM (because sub-agents touch prod APIs).
Technical Approach: The Agent Kernel
Structural Isolation, Not Behavioral Policy
Operating systems don't rely on processes being well-behaved. The MMU doesn't evaluate whether a memory access is "appropriate"—it checks whether the page is mapped. If it's not mapped, the access faults. No policy evaluation. No behavioral hope.
Agent security needs the same approach. The difference:
- Behavioral policy: "The agent shouldn't access Tenant B's data when processing Tenant A's transaction."
- Structural isolation: "When processing Tenant A's transaction, Tenant B's data doesn't exist in the agent's addressable memory."
Behavioral policy can fail—through bugs, prompt injection, or confused deputy errors. Structural isolation can't be bypassed because there's nothing to bypass. The data isn't there to leak.
Important positioning: the kernel prevents cross-scope exfiltration and cross-context confused deputy. It does not prevent an agent from doing harmful things that are genuinely authorized (burn budget, write garbage to its own partition, trigger legitimate-but-bad actions). The value is bounded blast radius, not perfect behavior.
The Agent Kernel Model
An agent kernel does for agents what an OS kernel does for processes:
- Isolates transactions—each transaction has its own context (capabilities, memory partition)
- Mediates all I/O—tool calls, memory access, delegation all go through the kernel
- Enforces policy cryptographically—the kernel decides what's allowed, not the agent
The LLM becomes untrusted userspace. It receives scoped context. It returns tool calls. The kernel verifies those calls against the capabilities bound to this transaction. The agent doesn't choose which credentials to use or which memory to search—the kernel binds them before the agent runs.
Context-scoped retrieval returns (chunk, proof_of_scope), where proof_of_scope is a signed statement binding context_id, namespace, and chunk_hash to the retrieval policy in force. It is cryptographic evidence that the chunk was eligible for this transaction—not a best-effort filter. This is our Trust Plane design; any equivalent cryptographic proof over context + policy works.
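A sketch of what that tuple could look like on the wire; field names are illustrative, not a fixed schema:
// Illustrative shape of a context-scoped retrieval result.
interface ProofOfScope {
  contextId: string;  // the transaction's execution context
  namespace: string;  // the memory partition the chunk came from
  chunkHash: string;  // hash binding the proof to this exact content
  policyId: string;   // retrieval policy in force at query time
  signature: string;  // Trust Plane signature over the fields above
}
interface ScopedChunk {
  chunk: string;
  proofOfScope: ProofOfScope;
}
An auditor can later re-verify the signature to confirm the chunk was eligible for that transaction, without trusting the retrieval logs.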
Cross-organizational verification relies on published roots and chain signatures; provenance can be checked without a callback. Revocation and status still require online state if you need real-time invalidation—offline audit does not remove the need for live revocation.
Defense in Depth: Crypto Core vs Broker
The kernel splits trust boundaries. A minimal crypto core enforces invariants; a larger broker handles wiring. This separation is what keeps the TCB small.
Crypto core must guarantee:
- Signature verification and chain integrity
- Monotonic attenuation (constraints only tighten)
- Use-count limits and replay protection
- Revocation enforcement
Broker can handle (non-core):
- Routing, transport, and I/O multiplexing
- Tool invocation wiring and sandbox policy enforcement
- Context assembly and caching
A broker bug can misroute or violate sandbox policy, but it cannot mint capabilities, expand authority, or bypass cryptographic invariants. That is the point of the split. This assumes all side-effectful I/O is mediated by the Trust Plane; any unmediated channel undermines the invariants.
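One way to make the split concrete is at the type level: the broker can ask the core to verify, but it has no constructor for authority. A minimal sketch, with illustrative interface and method names:
type Constraints = Record<string, unknown>;
interface CapabilityBlock { designatedExecutor: string; constraints: Constraints; signature: string; }
interface ToolCall { connector: string; params: Constraints; }
// The crypto core exposes checks only; nothing here creates authority.
interface CryptoCore {
  verifySignature(block: CapabilityBlock): boolean;
  attenuates(child: CapabilityBlock, parent: CapabilityBlock): boolean; // constraints only tighten
  withinUseLimit(chainId: string): boolean;                             // use counts / replay
  isRevoked(chainId: string): boolean;
}
// The broker handles wiring; every call still has to pass the core.
class Broker {
  constructor(private readonly core: CryptoCore) {}
  route(chainId: string, chain: CapabilityBlock[], call: ToolCall): void {
    if (this.core.isRevoked(chainId) || !this.core.withinUseLimit(chainId)) {
      throw new Error("chain revoked or exhausted");
    }
    for (let i = 0; i < chain.length; i++) {
      if (!this.core.verifySignature(chain[i])) throw new Error("bad signature");
      if (i > 0 && !this.core.attenuates(chain[i], chain[i - 1])) throw new Error("authority expansion");
    }
    // Transport, sandbox policy, and caching live below this line: a bug
    // here can misroute the call but cannot mint or widen a capability.
  }
}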
Proof of Continuity: The Cryptographic Mechanism
Traditional authorization asks: "Do you have a valid token?"
Proof of Continuity asks: "Are you the designated continuation of this specific transaction?"
The difference is profound. Bearer tokens grant authority through possession. Capability chains establish authority through cryptographic lineage. Interception of a bearer token grants access. Interception of a capability chain is useless—the attacker cannot sign as the designated executor.
Definition
Proof of Continuity (PoC)
A cryptographic proof that an executor (an untrusted AI agent) provides to the Trust Plane, demonstrating:
1. Origin immutability: The transaction descends from a specific root authority, unchanged throughout the transaction
2. Authority monotonicity: Each delegation hop attenuated authority—permissions can only narrow, never expand
3. Executor binding: The caller is the designated executor for the current hop, proven via cryptographic signature
4. Causal provenance: A verifiable chain exists from the original authority through every intermediate hop to the current transaction
The Trust Plane validates this proof before executing any action. The executor cannot self-assert authority—it must demonstrate continuity from a trusted origin.
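A sketch of that check in TypeScript. Types and helper names are illustrative, and the declared crypto helpers stand in for any signature library with these semantics; this is not a normative algorithm:
type Constraints = Record<string, unknown>;
interface Block {
  signer: string;             // public key that signed this block
  prevHash: string;           // hash of the predecessor block
  hash: string;               // hash of this block
  designatedExecutor: string; // public key allowed to continue the chain
  constraints: Constraints;
}
// Hypothetical crypto helpers, standing in for a real signature library.
declare function verifyBlockSig(block: Block): boolean;
declare function verifyTxnSig(txnSig: string, signerPubkey: string): boolean;
declare function atLeastAsTight(child: Constraints, parent: Constraints): boolean;
function verifyContinuity(chain: Block[], caller: string, txnSig: string, trustedRoots: Set<string>): boolean {
  if (!trustedRoots.has(chain[0].signer)) return false;        // 1. origin immutability
  for (let i = 1; i < chain.length; i++) {
    if (chain[i].prevHash !== chain[i - 1].hash) return false; // 4. causal provenance
    if (!verifyBlockSig(chain[i])) return false;
    if (!atLeastAsTight(chain[i].constraints, chain[i - 1].constraints)) return false; // 2. monotonicity
  }
  const head = chain[chain.length - 1];
  return caller === head.designatedExecutor && verifyTxnSig(txnSig, caller); // 3. executor binding
}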
Prior art: Capability-based authorization isn't new. Google's Macaroons (2014), Biscuit tokens, UCAN, and decades of object-capability research established these primitives. Our contribution is applying them specifically to AI agent workflows—where delegation happens at machine speed, constraints must encode business logic, and audit trails need to satisfy emerging KYA requirements.[1]
[1] Nicola Gallo (2025): "PIC Model — Provenance Identity Continuity for Distributed Execution Systems" — Formal treatment of Proof of Continuity as a solution.
How Capability Chains Work
A capability chain is an append-only sequence of cryptographically signed blocks. Each block designates the next executor, adds constraints, and is signed by the Trust Plane—a verification gateway that enforces constraints and holds real credentials.
Block 0 (Root) — signed by Trust Plane
├── capabilities: [refunds, payments]
├── constraints: {}
├── designated_executor: Agent_A_pubkey
│
Block 1 — Agent A requests delegation
├── constraints: amount ≤ $500, customer = "cus_123"
├── designated_executor: Agent_B_pubkey
│
Block 2 — Agent B requests execution
├── action: stripe.refund
├── amount: $75
└── Verification: ✓ chain valid, ✓ constraints satisfied, ✓ signed by B
Fork/Join Semantics
Delegation chains are not always linear. A single executor can fork to multiple successors, and later those branches can re-join. The Trust Plane tracks the active head set for each root and only allows joins that consume active heads.
PCA0 (root)
└── PCA1 (A)
├── PCA2 (B)
└── PCA3 (C)
└── join → PCA4 (D)
Forks create multiple active heads; joins consume the active head set and emit a single continuation. This keeps continuity auditable and prevents silent, untracked divergence in delegated authority.
The critical properties:
- Attenuation is structural: Each block can only narrow permissions. Block 1 cannot grant capabilities Block 0 didn't have. This is enforced cryptographically—the Trust Plane won't sign blocks that expand authority.
- Continuity is proven: When Agent B invokes an action, it must sign the transaction with its private key. The Trust Plane verifies the signature matches the designated executor in the chain. Possession of the chain isn't enough—the caller must prove it IS the designated continuation.
- Constraints are enforced at execution: The Trust Plane evaluates constraints against the actual transaction parameters before execution. "amount ≤ $500" isn't an if-statement in agent code—it's a cryptographic check at the gateway.
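To make the last point concrete: the constraint from Block 1 is data the gateway evaluates against the actual call, not logic the agent runs. A toy sketch mirroring the $500 / cus_123 example above:
interface RefundParams { amount: number; customer: string; }
// Constraints accumulated along the chain for this transaction.
const constraints = { max_amount: 500, customer: "cus_123" };
// Evaluated by the Trust Plane immediately before execution.
function satisfies(p: RefundParams): boolean {
  return p.amount <= constraints.max_amount && p.customer === constraints.customer;
}
satisfies({ amount: 75, customer: "cus_123" });  // true: the refund executes
satisfies({ amount: 750, customer: "cus_123" }); // false: rejected at the gateway, not in agent code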
Context Isolation
The kernel enforces a primary invariant: each transaction runs in isolation, with memory scoped to that invocation's designated context.
This is stricter than per-user or per-tenant isolation. When a transaction arrives, the kernel creates an isolated execution context—a runtime environment with bound capabilities and partitioned memory:
Transaction_A (Alice's Transaction 1):
context_id: ctx_txn_001
capabilities: [Cap_A bound to opaque tool handles]
memory_partition: tenant_acme/user_alice/transaction_001/*
Transaction_B (Alice's Transaction 2):
context_id: ctx_txn_002
capabilities: [Cap_B bound to opaque tool handles]
memory_partition: tenant_acme/user_alice/transaction_002/*
inherited_context: [] # Fresh start unless explicitly mapped
Transaction 2 cannot access Transaction 1's memory partition—even though it's the same user. Prior context is only available if explicitly mapped into the new transaction's scope. This prevents:
- Cross-session leakage: Sensitive data from one conversation can't semantically bleed into another
- Accumulation attacks: Poisoned content in Transaction 1 can't corrupt Transaction 2
- Persistent manipulation: SpAIware-style attacks that rely on long-lived memory writes
The agent processing Alice's first transaction literally cannot name her second transaction's partition. It's not filtered—it's unaddressable. Same guarantee processes get: fresh address space per invocation, even for the same user.
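Concretely, the partition key is derived by the kernel from the transaction and never taken from agent output, so there is no parameter through which the agent could name another partition. A minimal sketch; the function names are hypothetical:
// The kernel derives the partition; the agent never supplies it.
function partitionFor(tenant: string, user: string, txnId: string): string {
  return `${tenant}/${user}/${txnId}/`;
}
// Retrieval is closed over the partition at context-creation time.
function makeRetriever(partition: string, store: Map<string, string>) {
  return (key: string) => store.get(partition + key); // keys outside the partition are unaddressable
}
const retrieve = makeRetriever(
  partitionFor("tenant_acme", "user_alice", "transaction_001"),
  new Map(),
);
// There is no argument the agent can pass to reach transaction_002's partition.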
This Isn't Far Away—It's Shipping
VC feedback often asks if this is "too early." The reality: multi-agent delegation and multi-model routing are already in production (see Salesforce/ServiceNow/SAP examples above, AWS Bedrock AgentCore patterns). The pain is present today; the missing piece is the authorization/continuity layer we're building.
Why Interception Becomes Useless
Consider an attack scenario:
- The Trust Plane sends a capability chain to Agent B
- An attacker intercepts the chain in transit
- The attacker presents the chain to the Trust Plane
The Trust Plane validates the chain—it's cryptographically valid. The Trust Plane identifies Agent B as the designated executor. The Trust Plane demands cryptographic proof: sign this transaction.
The attacker cannot sign. They have the chain but not Agent B's private key. The transaction is rejected.
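The final step is an ordinary challenge-response. A sketch using Node's built-in Ed25519 support; the protocol framing is ours, the crypto calls are standard:
import { generateKeyPairSync, sign, verify } from "node:crypto";
// Agent B holds a private key; the chain designates B's public key.
const { publicKey, privateKey } = generateKeyPairSync("ed25519");
// The Trust Plane issues a transaction-specific challenge.
const challenge = Buffer.from("txn_001:stripe.refund:75");
// Only the designated executor can produce this signature.
const signature = sign(null, challenge, privateKey);
// An attacker holding the chain but not the key fails this check.
console.log(verify(null, challenge, publicKey, signature)); // true only for Agent B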
Important caveat: This protection assumes the attacker cannot compromise the designated agent itself. If an attacker can induce Agent B to sign malicious transactions—via prompt injection, session smuggling, or other agent-level attacks—then the signature verification passes. This is why tight constraints matter: even if the agent is tricked into signing, the Trust Plane enforces the bounds.
What We Build
Three components that implement the agent kernel model:
Amla SDK
Client libraries for Python and TypeScript that make secure defaults easy.
- Generate capability chains with constraints (RBAC/ABAC map directly)
- Wrap agent actions for kernel verification
- Context-scoped memory access (retrieval returns signed proof-of-scope)
- Works with any agent framework (LangChain, CrewAI, OpenAI Agents SDK)
import { Amla } from "@amla/sdk";
const amla = new Amla({ apiKey: process.env.AMLA_KEY });
// Start a secure chain for a decision
const chain = await amla.chains.begin({
decision_id: "C_482193",
max_amount: 100,
ttl: "5m"
});
// Trust Plane verifies chain, then executes
const result = await amla.actions.execute({
chain_id: chain.id,
actor: "support_bot",
intent: "refund",
connector: "stripe.refund",
amount: 75
});
// result.receipt -> signed, auditable record
Amla Trust Plane (The Kernel Runtime)
The enforcement point that implements kernel semantics.
- Creates isolated execution contexts per transaction
- Verifies chain signatures and constraint satisfaction
- Enforces attenuation (rejects authority expansion)
- Partitions memory by transaction (context-scoped retrieval)
- Holds real credentials (agents never see raw API keys)
- Executes actions only after verification passes
The Trust Plane is the kernel. Agents present chains and invoke actions. The kernel verifies capabilities, scopes context, and enforces constraints before anything touches Stripe, your database, or your core systems. The agent is untrusted userspace—it can't choose which credentials to use or which memory to search.
Trust Assumptions
The Trust Plane is explicitly trusted infrastructure:
- Is it trusted? Yes. The Trust Plane is the root of authority for all capability chains.
- Is compromise catastrophic? Yes. If the Trust Plane is compromised, all security guarantees are void.
- Blast radius? Similar to IAM providers or secret managers. This is infrastructure that requires hardened deployment.
We don't pretend this is "trustless." The Trust Plane is a policy enforcement point that must be secured accordingly—with HSM-backed keys, audit logging, and operational controls appropriate to the sensitivity of the actions it gates.
Amla Observability
Audit trails and compliance console.
- Signed receipts for every action (cryptographic proof, not just logs)
- Per-workflow chain viewer (see delegation path and constraints)
- Reports for ops, risk, and compliance teams
- Query: "who authorized what, under which constraints, and when?"
The authorization chain provides strong audit evidence—though full compliance also requires operational controls around key management, incident response, and monitoring.
Competitive Landscape
What Identity Providers Do
Okta, Auth0, SPIFFE: Solve "who is this agent?" with authentication, lifecycle management, and policy-based access control. Modern IAM evaluates contextual signals, dynamic conditions, and delegated permissions.
What they don't do: Prove that a specific request is the authorized continuation of a specific workflow with constraints that accumulated at each hop. Identity providers verify the entity. Capability infrastructure verifies the lineage.
What Orchestration Platforms Do
OpenAI Agents SDK, LangChain, CrewAI: Solve "what should this agent do?" with workflow management, task routing, and multi-agent coordination patterns.
What they don't do: Enforce authorization that survives multi-hop delegation. The patterns that work for orchestration break when authorization matters.
What Traditional Auth Does
OAuth, JWT, API keys: Prove identity and carry claims. Bearer tokens grant authority through possession.
What they don't do:
- Attenuate structurally (delegation can expand authority)
- Prove provenance (no cryptographic link to initiating workflow)
- Bind constraints to execution context (policies are advisory)
- Survive async boundaries without replay risk
The Gap
| Requirement | IAM | Orchestration | Policy Middleware | OAuth | Amla |
|---|---|---|---|---|---|
| Agent identity | ✓ | — | ✓ | ✓ | ✓ |
| Workflow orchestration | — | ✓ | — | — | —* |
| Single-hop authorization | ✓ | — | ✓ | ✓ | ✓ |
| Multi-hop provenance | — | — | — | — | ✓ |
| Structural attenuation | — | — | — | — | ✓ |
| Constraint enforcement | Policy | — | Policy | — | Crypto |
| Cross-org verification | Federation | — | Federation | — | ✓ |
*Amla integrates with existing orchestration platforms rather than replacing them.
We don't compete with Okta for identity or LangChain for orchestration. We provide the missing layer between them.
Evolving landscape: This table is simplified. Some orchestration platforms are adding tool-level permissions. Major providers could integrate capability-based auth natively. Secret managers already mitigate raw key exposure. The question is whether authorization for agent workflows becomes a feature or infrastructure—we're betting on infrastructure.
The Superset, Not the Alternative
A common misconception frames capability-based authorization as an alternative to RBAC and ABAC. It isn't. It's a superset that contains them.
RBAC maps directly to capability constraints:
Traditional RBAC:
"Agents with role:support can access resource:refunds"
Capability equivalent:
grant(refunds, constraints={role: "support"})
ABAC maps directly to capability constraints:
Traditional ABAC:
"If agent.department == 'finance' AND amount < 10000 AND time.hour in 9-17, allow"
Capability equivalent:
grant(payments, constraints={
department: "finance",
max_amount: 10000,
time_window: "09:00-17:00"
})
Every RBAC policy, every ABAC rule, every conditional access check can be expressed as a constraint on a capability token. The mapping is mechanical. Nothing is lost.
What's gained is delegation.
RBAC and ABAC evaluate policies at a central decision point. When Agent A calls an API, the policy server checks Agent A's roles and attributes. This works.
When Agent A delegates to Agent B? The policy server checks Agent B. It works.
When Agent B delegates to Agent C? The policy server checks Agent C. But now there's no cryptographic proof that Agent C's authority descends from Agent A's original grant. No structural guarantee that permissions narrowed at each hop. The policy server relies on configuration to prevent escalation—and configuration can be wrong.
This is the architectural boundary. Policy-based systems work for single-hop authorization. They don't extend to multi-hop without bolting on provenance mechanisms that capability chains provide natively.
| Deployment Stage | Policy Middleware | Capability Chains |
|---|---|---|
| One agent, one API | ✓ Works | ✓ Works |
| One agent, many APIs | ✓ RBAC/ABAC evaluation | ✓ Scoped constraints |
| Agent delegates to agent | ⚠️ New policy lookup, no provenance | ✓ Chain extends naturally |
| 3+ hop delegation | ✗ No cryptographic lineage | ✓ Provenance preserved |
| Cross-org verification | ✗ Requires federated policy infrastructure | ✓ Chain self-verifies |
Start where you are. Scale to where you're going.
Teams adopting Amla for single-agent deployments use the same RBAC/ABAC mental models they already know—roles and attributes become constraints. When multi-agent delegation becomes necessary, the architecture already supports it. When cross-org workflows emerge, the primitives are there. No re-platforming. No architectural dead-ends.
Policy middleware optimizes for today's single-agent deployments. Capability infrastructure optimizes for the multi-agent future that's already arriving.
The spectrum: Capability chains can also reference external PDPs for dynamic policy evaluation. A key insight: if policy updates are themselves constrained to only attenuate (tightening is instant, loosening falls back to the chain's original policy), you preserve strict attenuation while gaining instant updates for the common case. The architecture is a spectrum—pure capabilities on one end, pure RBAC/ABAC on the other—but the hybrid designs are better than they first appear: you don't necessarily sacrifice attenuation guarantees for dynamic policy.
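That hybrid is straightforward to express: the effective policy is the chain's grant intersected with the PDP's current answer, so a PDP that tightens takes effect immediately, while a PDP that loosens (or is unreachable) falls back to the chain. A sketch with illustrative numeric constraints:
interface NumericPolicy { maxAmount: number; ttlSeconds: number; }
// The effective policy can only be tighter than what the chain granted.
function effectivePolicy(chain: NumericPolicy, pdp: NumericPolicy | null): NumericPolicy {
  if (!pdp) return chain; // PDP unreachable: chain constraints still hold
  return {
    maxAmount: Math.min(chain.maxAmount, pdp.maxAmount),
    ttlSeconds: Math.min(chain.ttlSeconds, pdp.ttlSeconds),
  };
}
effectivePolicy({ maxAmount: 500, ttlSeconds: 300 }, { maxAmount: 100, ttlSeconds: 300 });  // tightens to 100, instantly
effectivePolicy({ maxAmount: 500, ttlSeconds: 300 }, { maxAmount: 9999, ttlSeconds: 999 }); // never exceeds the chain's 500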
Use Cases
Insurance Claims Processing
An intake agent receives a claim. It delegates to a damage assessment agent with constraints: this claim only, read-only access to policy data. The assessment agent delegates to a payout agent with tighter constraints: this claim, verified assessment, maximum $10,000 minus deductible.
Each delegation narrows authority. The payout agent can execute within bounds—it cannot access other claims, exceed the assessed amount, or bypass the deductible. The chain provides complete audit trail for regulators.
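Expressed with the SDK shapes sketched earlier; the chains.delegate call and its parameters are illustrative, not a final API:
import { Amla } from "@amla/sdk";
const amla = new Amla({ apiKey: process.env.AMLA_KEY });
// Root: intake agent, scoped to this claim.
const intake = await amla.chains.begin({ decision_id: "claim_7741", ttl: "1h" });
// Hop 1 (hypothetical delegate call): assessment agent, read-only, this claim only.
const assess = await amla.chains.delegate(intake.id, {
  executor: "damage_assessment_agent",
  constraints: { claim: "claim_7741", policy_data: "read_only" },
});
// Hop 2: payout agent; constraints can only tighten further.
const payout = await amla.chains.delegate(assess.id, {
  executor: "payout_agent",
  constraints: { claim: "claim_7741", requires: "verified_assessment", max_amount: 10000 },
});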
Healthcare Cross-Organization
Kaiser's referral agent needs to book a specialist at Stanford. Traditional approach: fax machines, phone calls, 3-5 business days.
With capability chains: Kaiser's agent creates a chain with patient authorization, coverage verification, and referral constraints. Stanford's scheduling system verifies the chain—no callback to Kaiser required. The chain proves: which patient, who approved, what constraints apply.
Trust bootstrap note: Cross-org requires Stanford to decide which Kaiser root keys to accept. Capability chains don't solve trust bootstrap—they make authorization after trust is established verifiable and constrained.
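In code terms, reusing the Block type and verifyContinuity sketch from the Proof of Continuity section (the root key ID is illustrative):
// Stanford accepts specific Kaiser roots out of band (the trust bootstrap),
// then verifies each referral chain locally, with no callback to Kaiser.
const kaiserRoots = new Set(["kaiser_root_pubkey"]);
function acceptReferral(chain: Block[], stanfordSchedulerKey: string, txnSig: string): boolean {
  // Constraints in the chain (patient, approval, coverage) are enforced by the same check.
  return verifyContinuity(chain, stanfordSchedulerKey, txnSig, kaiserRoots);
}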
Enterprise Refund Workflows
A support agent handles a refund transaction. Company policy: refunds under $500 are auto-approved, over $500 require manager approval. The capability chain encodes this:
- Support agent: authorized for refunds, amount ≤ $500
- Stripe connector: verifies chain, executes refund
- Receipt: cryptographic proof of authorization path
The $500 limit isn't an if-statement in the agent's code. It's a constraint verified at the Trust Plane before the refund executes.
Risks & Limitations
Intellectual honesty requires acknowledging what we don't solve and where we might fail.
What We Don't Solve
Agent-level compromise: If an attacker can induce the legitimate agent to sign malicious transactions (prompt injection, session smuggling), signature verification passes. Tight constraints are the last line of defense, but they cannot prevent an agent from acting maliciously within its authorized bounds.
Trust bootstrap: Cross-organizational delegation requires deciding which root keys to trust. This is a federation and governance problem, not a cryptographic one. We make authorization after trust verifiable; we don't solve which organizations to trust.
Operational controls: Cryptographic authorization provides strong evidence, but compliance frameworks evaluate controls over systems. Key management, incident response, monitoring, and change management are still required.
Market Risks
Adoption timing: If enterprises delay serious agent deployments, the market for authorization infrastructure shrinks. Our thesis depends on agents taking real actions in production, not just demos.
Platform integration: If OpenAI, Anthropic, or major orchestration platforms build authorization natively, the standalone market narrows. We're betting that authorization is infrastructure, not a feature.
Regulatory uncertainty: If KYA requirements don't materialize or evolve differently than expected, the compliance driver weakens. We're betting that audit requirements for autonomous systems will increase, not decrease.
Technical Risks
Performance at scale: Cryptographic verification adds latency. At agent velocity, milliseconds matter. We're optimizing for sub-10ms verification, but real-world performance depends on deployment patterns.
Key management complexity: Per-agent keys, rotation, and compromise response add operational burden. We're building tooling to simplify this, but it's inherently more complex than bearer tokens.
Developer adoption: Capability-based security requires understanding attenuation and constraint design. If the learning curve is too steep, adoption suffers. Our SDK aims to make secure defaults easy, but developer experience is critical.
The Amla Thesis
We believe:
- Agents need kernels, not guardrails. Operating systems didn't get secure by writing better processes. They got secure by building kernels that don't require processes to be well-behaved. Agent security will follow the same path: structural isolation that makes "bad" impossible to express, not behavioral policies that hope for compliance.
- The real boundary is the transaction. Per-tenant isolation is SaaS 101. Per-transaction isolation is the architectural insight. Even for a single user, conversations accumulate—and sensitive data from one session can bleed into another if nothing enforces transaction-level scoping. Google's Agent Sandbox validates this at the infrastructure layer; we provide the authorization layer.
- Both confused deputies must be solved. Current guidance focuses on the authority confused deputy (credential management). The information confused deputy (memory leakage) is equally dangerous. A kernel that mediates both tool invocation AND context assembly is required.
- Cryptographic enforcement beats policy enforcement. At agent velocity, policy servers become bottlenecks and if-statements become bypass targets. Constraints that are mathematically guaranteed provide stronger security with less overhead. Capability chains make RBAC/ABAC delegation-safe—not as an alternative, but as a superset.
- Compliance becomes competitive advantage. As KYA requirements emerge, teams with queryable authorization chains will pass audits faster. "Show me who authorized what, with what constraints, through which delegation path" becomes trivial.
- The window is narrow. Enterprise agent deployments are accelerating. Security patterns established in the next 18 months will define the category. Building now positions us to set those patterns.
We're building the kernel layer that doesn't exist—infrastructure that makes multi-agent systems structurally isolated, cryptographically constrained, and provably auditable.
If you're deploying agents that take real actions and need authorization that survives delegation with context isolation, we'd like to talk.
Interested in working together?
We're working with early design partners to build the authorization layer for agentic systems.
Become a design partner