GitHub's Agentic Security: Right Problem, Incomplete Solution
GitHub's security principles minimize autonomy to minimize risk. But what if you could maximize autonomy within cryptographic bounds?
GitHub just published their agentic security principles for Copilot. Their threat model is correct. Their design philosophy reveals the tradeoff everyone is making.
We’ve built all of our hosted agents to maximize interpretability, minimize autonomy, and reduce anomalous behavior.
Minimize autonomy. That’s the current industry answer to agent security.
What GitHub Gets Right
Their threat model identifies three core risks:
- Data exfiltration — agents with internet access leaking sensitive context
- Impersonation and attribution — unclear responsibility when agents act
- Prompt injection — hidden directives manipulating agent behavior
All correct. These are the same attack vectors we’ve been documenting.
Their mitigations are sensible:
- Visible context — strip invisible Unicode and HTML before passing to agents
- Firewalling — limit external network access
- Minimal information — don’t give agents secrets they don’t need
- Human-in-the-loop — require approval for irreversible actions
- Clear attribution — co-commit with the initiating user
- Permission-based context — only gather from authorized users
This is defense-in-depth done well. For a hosted coding agent, these controls are appropriate.
The Tradeoff They’re Making
The limiting factor is the fourth principle: “preventing irreversible state changes” through mandatory human review.
The Copilot coding agent is only able to create pull requests; it is not able to commit directly to a default branch. Pull requests created by Copilot do not run CI automatically; a human user must validate the code and manually run GitHub Actions.
This works for code review. It doesn’t work for:
- Payment processing agents that need to execute transactions
- Customer service agents that need to issue refunds
- Healthcare agents that need to update records
- Supply chain agents that need to place orders
These workflows require agents to take consequential actions autonomously. The “human-in-the-loop for everything” pattern becomes the bottleneck it was meant to eliminate.
The Alternative: Bounded Autonomy
What if instead of minimizing autonomy, you constrained it cryptographically?
GitHub's Model:
Agent proposes → Human approves → Action executes
Capability Chain Model:
Human grants scoped capability → Agent executes within boundsThe difference: GitHub’s approach requires human judgment on every consequential action. Capability chains encode that judgment upfront—this agent can issue refunds up to $200 for this customer’s order—and then the agent operates autonomously within those constraints. The challenge shifts from reviewing every action to correctly scoping the initial capability grant—a tractable problem with better tooling.
Capability chains also address GitHub’s other threat vectors. Data exfiltration? The capability specifies which data the agent can access—even if prompt-injected, it can’t reach data outside its granted scope. Prompt injection? The attacker controls the agent’s intent, but not its authority. The gateway enforces constraints regardless of what the agent tries to do.
Attribution Without Bottlenecks
GitHub’s attribution model is instructive:
Pull requests created by the Copilot coding agent are co-committed by the user who initiated the action.
This creates an audit trail: action → agent → initiator. Good for accountability.
Capability chains provide the same audit trail without requiring human approval at execution time:
Capability Chain:
├─ Root: Gateway issues refund capability to Agent A
│ └─ constraints: max_amount=$500, customer=cus_123
├─ Block 2: Agent A delegates to Agent B
│ └─ constraints: max_amount=$200 (attenuated)
└─ Execution: Agent B issues $150 refund
└─ Signed proof of entire chainThe chain proves: who granted the original authority, what constraints accumulated, which agent executed, all cryptographically verified. This is the “traceability” that Article 14 of the EU AI Act requires.
Complementary, Not Competing
GitHub’s principles aren’t wrong—they’re appropriate for their use case. Coding agents can afford human-in-the-loop because code review is already a human process.
But the broader agent ecosystem needs a different answer. When agents cross organizational boundaries, when they process transactions, when they operate in regulated industries—“minimize autonomy” isn’t viable.
The question becomes: how do you give agents real autonomy while maintaining real security?
That’s what Proof of Continuity answers. Not “was a human watching?” but “was this action authorized by the cryptographic chain that preceded it?”
For the technical architecture, see Proof of Continuity. For attack vectors these controls address, see 5 Ways Your AI Agents Will Get Hacked.