OpenAI's Agent Playbook Has a Security-Shaped Hole

The #1 AI lab just told enterprises how to build agents. They forgot to explain how to secure them.

Tags: security, openai, agents, analysis

OpenAI recently published “A Practical Guide to Building Agents,” a 32-page enterprise playbook distilling deployment lessons into actionable patterns. It’s thoughtful, well-structured, and genuinely useful for teams building their first agentic systems.

It also contains zero security model for credentials.

This isn’t an oversight. It’s a gap that reveals the fundamental challenge facing every team deploying AI agents today: we have sophisticated orchestration patterns without corresponding authorization infrastructure.

What the Guide Gets Right

OpenAI’s framework is elegant. Agents consist of three components:

  • Model: The LLM powering reasoning and decisions
  • Tools: External functions or APIs the agent invokes
  • Instructions: Guidelines defining agent behavior
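To make the decomposition concrete, here is a minimal sketch of how the three components fit together. The types and names are hypothetical, not the OpenAI Agents SDK; the point is where authorization would have to live if the model included one.

```python
# Hypothetical types, not the OpenAI Agents SDK -- only the shape of the
# guide's three-component model: model, tools, instructions.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Agent:
    model: str                     # the LLM powering reasoning and decisions
    instructions: str              # guidelines defining agent behavior
    tools: dict[str, Callable] = field(default_factory=dict)  # external functions/APIs

    def invoke_tool(self, name: str, **kwargs):
        # Note what is absent: no credential, no scope, no caller identity.
        # The component model stops at "the agent can call the tool."
        return self.tools[name](**kwargs)

refund_agent = Agent(
    model="gpt-4.1",
    instructions="Handle refund requests under company policy.",
    tools={"issue_refund": lambda order_id, amount: f"refunded {amount} for {order_id}"},
)
print(refund_agent.invoke_tool("issue_refund", order_id="A-1", amount=120))
```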

The guide covers orchestration patterns that matter for production systems—single-agent loops, multi-agent manager hierarchies, decentralized handoffs between specialized agents. For teams evaluating whether agents fit their use case, it’s a solid starting point.

The Missing Security Model

Here’s what’s absent: any explanation of how agents authenticate to tools, how authorization boundaries are enforced, or how credentials transfer when one agent delegates to another.

The guide’s “guardrails” section lists six protection categories:

| Guardrail Type | What It Does |
| --- | --- |
| Relevance classifier | Flags off-topic queries |
| Safety classifier | Detects jailbreaks |
| PII filter | Redacts personal information |
| Moderation | Flags harmful content |
| Tool safeguards | Risk ratings |
| Rules-based | Regex, blocklists |

“Tool safeguards” is the closest thing to authorization guidance, and here’s the full recommendation:

“Assess the risk of each tool available to your agent by assigning a rating—low, medium, or high—based on factors like read-only vs. write access, reversibility, required account permissions, and financial impact.”

This is policy-based risk assessment with human escalation as the safety net. Not cryptographic enforcement. Not capability attenuation. A risk rating.
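In code, that recommendation reduces to something like the sketch below (hypothetical tool names): a policy table and an approval flag, with nothing downstream that verifies the label was honored.

```python
# What "tool safeguards" amounts to in practice: a lookup table and an
# escalation check. Hypothetical tools; nothing here is cryptographically
# enforced, which is exactly the problem.
TOOL_RISK = {
    "search_docs":  "low",     # read-only, reversible
    "send_email":   "medium",  # write access, mostly reversible
    "issue_refund": "high",    # financial impact, hard to reverse
}

def requires_human_approval(tool_name: str) -> bool:
    # Any code path the agent can influence simply consults a label.
    # If this check is skipped or the table is wrong, nothing downstream notices.
    return TOOL_RISK.get(tool_name, "high") == "high"

print(requires_human_approval("issue_refund"))  # True -> route to a person
```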

Why This Matters

Consider the multi-agent patterns OpenAI recommends.

Manager Pattern: A central agent delegates to specialized sub-agents via tool calls. When the Spanish Agent needs to call a translation API, where does authorization come from? The guide doesn’t say.

Decentralized Pattern: Agents hand off workflow execution to each other. Each handoff transfers “conversation state.” Not scoped credentials.

Both patterns work brilliantly for orchestration. Both break silently when you need authorization that survives multi-hop execution.
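A schematic sketch of the manager-pattern handoff as the guide describes it (hypothetical names throughout) makes the gap visible: conversation state transfers, authority stays ambient.

```python
# Hypothetical manager-pattern handoff, following the guide's shape.
class SubAgent:
    """Stand-in for a specialized agent (e.g., the guide's Spanish Agent)."""
    def run(self, handoff: dict) -> str:
        return f"handled: {handoff['task']}"

def manager_delegate(sub_agent: SubAgent, conversation_state: list, task: str) -> str:
    # What transfers is conversation state plus a task description...
    handoff = {"messages": conversation_state, "task": task}
    # ...while authority is ambient: the sub-agent calls its APIs with whatever
    # credentials the process already holds. Nothing narrows, nothing expires,
    # and nothing records who delegated what.
    return sub_agent.run(handoff)

manager_delegate(SubAgent(),
                 conversation_state=[{"role": "user", "content": "hola"}],
                 task="translate the reply to Spanish")
```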

The Production Reality

This gap isn’t theoretical. Recent incidents demonstrate what happens when agent authorization fails:

  • Replit incident (July 2025): An AI agent erased 1,206 executive records from a live database in seconds. The agent had standing credentials and executed thousands of commands per minute.

  • Salesloft Drift breach (August 2025): Compromised OAuth tokens exposed 700+ organizations over 10 days. Tokens should have been revoked months earlier but persisted because authorization wasn’t lifecycle-aware.

Both incidents share a pattern: credentials that existed without corresponding authorization constraints. The agents had access. Nothing enforced limits.

Human Escalation Doesn’t Scale

OpenAI’s answer to high-risk operations is human-in-the-loop intervention:

“High-risk actions: Actions that are sensitive, irreversible, or have high stakes should trigger human oversight until confidence in the agent’s reliability grows.”

This recommendation acknowledges a real problem while proposing an unscalable solution. A traditional application might perform 50 operations per minute; an agent can execute 5,000. At that velocity, human oversight becomes impossible—agents complete entire workflows, including violations, before anyone can act.

“Human oversight until confidence grows” doesn’t ship production agents. Mathematical guarantees do.

What Secure Agent Authorization Requires

The gap in OpenAI’s guide points to infrastructure that doesn’t exist in most organizations:

Credential scoping at the tool level: When an agent invokes a refund API, the constraint amount <= 500 should be cryptographically enforced at the gateway—not an if-statement the agent could bypass.
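A rough illustration of the difference, using an assumed design rather than any particular product: the constraint travels inside a signed capability, and the gateway, not the agent, verifies both the signature and the limit.

```python
# Minimal sketch of gateway-side enforcement (assumed design, hypothetical
# names): the caveat "amount <= 500" rides inside an HMAC-signed capability,
# and the gateway checks it before the refund API is ever reached.
import hmac, hashlib, json

GATEWAY_KEY = b"shared-secret-held-by-issuer-and-gateway"  # illustrative only

def mint_capability(tool: str, max_amount: int) -> dict:
    payload = json.dumps({"tool": tool, "max_amount": max_amount}, sort_keys=True)
    sig = hmac.new(GATEWAY_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": sig}

def gateway_authorize(capability: dict, tool: str, amount: int) -> bool:
    expected = hmac.new(GATEWAY_KEY, capability["payload"].encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, capability["sig"]):
        return False                      # tampered or forged capability
    claims = json.loads(capability["payload"])
    return claims["tool"] == tool and amount <= claims["max_amount"]

cap = mint_capability("issue_refund", max_amount=500)
assert gateway_authorize(cap, "issue_refund", 300)      # allowed
assert not gateway_authorize(cap, "issue_refund", 900)  # rejected at the gateway
```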

Capability attenuation across delegation: When Agent A delegates to Agent B, permissions must narrow. No sub-agent should accumulate more authority than its parent.
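One way to sketch that invariant (assumed semantics, hypothetical scopes): a child capability is derived from its parent and can only narrow.

```python
# Delegation-time attenuation: the child scope must be a subset of the parent's.
def attenuate(parent_scope: dict, requested_scope: dict) -> dict:
    child = {
        "tools": set(requested_scope["tools"]) & set(parent_scope["tools"]),
        "max_amount": min(parent_scope["max_amount"], requested_scope["max_amount"]),
    }
    if child["tools"] != set(requested_scope["tools"]):
        raise PermissionError("sub-agent requested tools its parent does not hold")
    return child

manager_scope = {"tools": {"lookup_order", "issue_refund"}, "max_amount": 500}
# The sub-agent only ever needed order lookups, so that is all it gets:
sub_scope = attenuate(manager_scope, {"tools": {"lookup_order"}, "max_amount": 100})
```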

Authorization that survives async execution: In choreographed systems where agents communicate through message queues, credentials can’t be bearer tokens that anyone who intercepts them can replay.

Proof of continuity, not just possession: By the third delegation hop, traditional tokens still validate but have no cryptographic link to the initiating workflow. Authorization must prove the requester IS the legitimate continuation of a transaction.
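A minimal sketch of what such a continuity proof could look like, assuming a hash-chained delegation record (a production design would use signatures and expiry, but the shape is the same): each hop commits to the one before it, so a verifier can trace the third hop back to the workflow that started it.

```python
# Assumed construction, not a named protocol: each delegation hop commits to
# the previous hop, so by hop three the credential still traces back to the
# initiating workflow. A bare bearer token carries no such chain.
import hashlib, json

def extend_chain(prev_link: str, hop: dict) -> str:
    record = json.dumps({"prev": prev_link, "hop": hop}, sort_keys=True)
    return hashlib.sha256(record.encode()).hexdigest()

root = extend_chain("genesis", {"workflow": "refund-7421", "agent": "manager"})
hop2 = extend_chain(root, {"agent": "billing-subagent"})
hop3 = extend_chain(hop2, {"agent": "payments-subagent"})
# A verifier holding the workflow's root link can recompute the chain and
# confirm hop3 is a continuation of refund-7421, not a replayed credential.
```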

The Industry Gap

OpenAI is building orchestration. Identity providers like Okta are building authentication. Nobody is building the authorization layer between them.

OpenAI’s guide explicitly identifies what tools can do (“read-only vs. write access, reversibility, required account permissions”) without providing infrastructure to enforce those boundaries. The missing piece is capability-based authorization with cryptographic enforcement that survives multi-hop, async, cross-organizational agent workflows.

What Comes Next

The timing window is narrow. Enterprise agent deployments are accelerating—46% of YC’s Spring 2025 batch were AI agent companies. Security requirements will emerge as deployments fail audits.

For teams building agents today, the practical guidance is:

  1. Treat OpenAI’s orchestration patterns as necessary but insufficient. Good architecture, incomplete security model.

  2. Don’t rely on human escalation for authorization. It doesn’t scale and creates audit gaps.

  3. Evaluate credential infrastructure before deploying multi-agent systems. The patterns that work beautifully for orchestration break silently when authorization matters.

  4. Look for cryptographic enforcement, not policy enforcement. If-statements don’t survive code-executing agents. Risk ratings don’t survive compromised systems.

OpenAI’s guide is valuable for what it teaches about agent orchestration. It’s equally valuable for what it reveals about the security infrastructure that doesn’t yet exist.


This is the first post in a series analyzing the emerging security landscape for AI agents. Next: a16z Calls ‘Know Your Agent’ a Big Idea for 2026—VCs identify the same gap. For specific attack patterns, see 5 Ways Your AI Agents Will Get Hacked.