LangGrinch: A Bug in the Library, A Lesson for the Architecture

A critical CVE in LangChain shows why credential isolation matters more than perfect code.

security langchain cve credentials architecture

On December 23, 2025, LangChain published a critical advisory for CVE-2025-68664, nicknamed “LangGrinch.” The Cyata research team discovered a serialization bug in langchain-core that enabled secret exfiltration and arbitrary object instantiation [1]. CVSS 9.3. Patch now if you haven’t.

The bug itself was straightforward: LangChain’s dumps() function didn’t escape user-controlled dictionaries containing a reserved lc key. When that data later passed through loads(), the deserializer treated attacker-shaped data as internal framework structure.
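
To make the shape concrete, here is an illustrative sketch of the kind of attacker-shaped dictionary the advisory describes. It is not a working exploit, and the environment variable name is just an example:

# Illustrative payload shape only; not a working exploit.
attacker_shaped = {
    "lc": 1,                   # reserved framework marker
    "type": "secret",          # tells the reviver this is a secret reference
    "id": ["OPENAI_API_KEY"],  # name of any environment variable (example)
}

# Before the patch, dumps() did not escape user-controlled dicts shaped like
# this, so after a round trip through dumps()/loads() the reviver treated the
# dict as framework structure and resolved it against os.environ instead of
# returning a plain dictionary.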

This is a coding mistake. Missing input escaping.

But the severity—environment variable exfiltration, including API keys and secrets—came from something else entirely: the framework had ambient access to credentials.

The Exfiltration Path

Here’s the code path that made this bug catastrophic [2]:

if (
    value.get("lc") == 1
    and value.get("type") == "secret"
    and value.get("id") is not None
):
    [key] = value["id"]
    if key in self.secrets_map:
        return self.secrets_map[key]
    if self.secrets_from_env and key in os.environ and os.environ[key]:
        return os.environ[key]  # <-- Returns any env variable
    return None

The secrets_from_env parameter was True by default. Any deserialized object could resolve environment variables. Combined with the serialization bug, an attacker could:

  1. Inject a crafted payload via prompt injection (shaping additional_kwargs or response_metadata)
  2. Let normal framework operations serialize and deserialize that payload
  3. Instantiate ChatBedrockConverse, which makes an HTTP request on construction
  4. Populate an HTTP header with any environment variable
  5. Exfiltrate secrets to an attacker-controlled endpoint

The attacker never directly accessed secrets. They tricked the framework into accessing secrets on their behalf.
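
The general shape of such a payload follows langchain-core’s serialization format: a “constructor” entry names a class to instantiate, and nested “secret” entries resolve environment variables. The sketch below is illustrative only; the class path and kwarg names are hypothetical stand-ins, not the actual exploit payload:

# Illustrative shape only; class path and kwarg names are hypothetical.
crafted = {
    "lc": 1,
    "type": "constructor",        # asks the reviver to instantiate a class
    "id": ["langchain_aws", "chat_models", "ChatBedrockConverse"],
    "kwargs": {
        # hypothetical field that ends up in an outbound HTTP request header
        "some_header_value": {
            "lc": 1,
            "type": "secret",     # resolved from os.environ pre-patch
            "id": ["AWS_SECRET_ACCESS_KEY"],
        },
    },
}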

The Bug vs. The Blast Radius

The bug was the missing escaping. That’s fixable. LangChain fixed it.

The blast radius came from architecture.

The serialization bug was the entry point. But an entry point into what? Into a layer that could resolve OPENAI_API_KEY, AWS_SECRET_ACCESS_KEY, database passwords—anything in the environment.

If that layer had no access to secrets, the bug would still exist. Object instantiation might still happen. But the attacker would exfiltrate… nothing.

The Kernel Analogy

Operating systems learned this lesson decades ago.

OS Security Model:
┌─────────────────────────────────────┐
│  User Space (untrusted)             │
│  - Applications                     │
│  - User processes                   │
│  - Potentially compromised code     │
├─────────────────────────────────────┤
│  Kernel Boundary                    │
│  - Validates all requests           │
│  - Mediates resource access         │
├─────────────────────────────────────┤
│  Kernel Space (privileged)          │
│  - Hardware access                  │
│  - Memory management                │
│  - Credentials / keys               │
└─────────────────────────────────────┘

A bug in a user-space application can crash that application. It cannot directly access kernel memory or hardware because a boundary enforces separation. The application must make a syscall, and the kernel validates before granting access.

Now consider the LangChain architecture:

LangChain Architecture (pre-patch):
┌─────────────────────────────────────┐
│  LLM Output (untrusted)             │
│  - additional_kwargs                │
│  - response_metadata                │
│  - Tool outputs                     │
│  - Prompt injection surface         │
├──────────── no boundary ────────────┤
│  Framework Layer                    │
│  - Serialization/deserialization    │
│  - Event streaming                  │
│  - Message history                  │
│  - os.environ access ← SECRETS HERE │
└─────────────────────────────────────┘

LLM output flows directly into framework internals that have ambient access to credentials. No kernel. No boundary. No mediation.

When the deserialization got confused by attacker-shaped data, secrets were right there to steal.

What Credential Isolation Looks Like

The alternative architecture separates the agent layer from credential access:

Isolated Architecture:
┌─────────────────────────────────────┐
│  Agent Layer (untrusted)            │
│  - LLM interactions                 │
│  - Framework code                   │
│  - Serialization/deserialization    │
│  - NO credentials in env            │
├─────────────────────────────────────┤
│  Gateway (security boundary)        │
│  - Validates capability tokens      │
│  - Checks authorization scope       │
│  - Mediates all sensitive ops       │
├─────────────────────────────────────┤
│  Credential Store (privileged)      │
│  - API keys                         │
│  - Database passwords               │
│  - Service accounts                 │
└─────────────────────────────────────┘

In this model:

  • The agent layer has no environment variables containing secrets
  • secrets_from_env wouldn’t work because there are no secrets in the env
  • To access a credential, the agent presents a capability token to the gateway
  • The gateway verifies the token’s scope before retrieving the credential
  • Even if the serialization bug happens, the attacker exfiltrates nothing

The bug still exists. The fix is still needed. But the blast radius collapses from “exfiltrate any secret” to “instantiate objects with no sensitive access.”
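
A minimal sketch of what the gateway check might look like. Everything here is hypothetical (the CapabilityToken and CredentialGateway names, the in-memory store standing in for a real secrets backend); it illustrates the structure, not a specific product:

from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class CapabilityToken:
    """Grants access to one named credential and nothing else."""
    credential_name: str
    signature: str  # in practice: a signed, expiring token


class CredentialGateway:
    def __init__(self, store: dict[str, str], verify: Callable[[CapabilityToken], bool]) -> None:
        self._store = store    # privileged side: real secrets live only here
        self._verify = verify  # e.g. HMAC or public-key signature verification

    def get_credential(self, token: CapabilityToken) -> str:
        # The agent layer never reads the store directly. Every request is
        # mediated here, so a compromised agent can reach only the credentials
        # its tokens were explicitly scoped to.
        if not self._verify(token):
            raise PermissionError("invalid capability token")
        if token.credential_name not in self._store:
            raise PermissionError("token not scoped to this credential")
        return self._store[token.credential_name]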

LangChain’s Response

LangChain’s patch introduced several breaking changes [1]:

  1. Fixed the bug: Added escaping for lc keys in user-controlled data
  2. Restricted deserialization: New allowed_objects parameter defaults to 'core'
  3. Disabled ambient secrets: secrets_from_env now defaults to False
  4. Blocked templates: Jinja2 templates now blocked by default

The third change is architecturally significant. It’s an acknowledgment that ambient secret access at the framework layer is dangerous. Better to require explicit opt-in than to assume the framework should resolve credentials by default.

This is directionally correct. It doesn’t fully isolate credentials—they’re still in the environment, just not auto-resolved. But it reduces the attack surface.
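
In practice, the post-patch posture means passing secrets explicitly when deserialization genuinely needs them, instead of relying on ambient environment access. A sketch, assuming the loads() API from langchain_core.load and a serialized_text / my_openai_key that exist in your code (exact parameters may vary by version):

from langchain_core.load import loads

# No ambient env resolution: hand over exactly the secrets this object needs.
obj = loads(
    serialized_text,                                # your serialized payload
    secrets_map={"OPENAI_API_KEY": my_openai_key},  # explicit, scoped opt-in
)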

The Deeper Pattern

Cyata’s writeup makes this point explicitly [2]:

LLM output is an untrusted input. If your framework treats portions of that output as structured objects later, you must assume attackers will try to shape it.

This is the architectural lesson. LLM outputs can be influenced by prompt injection. Those outputs flow through framework internals—serialization, caching, streaming, message history. If those internals have access to secrets, prompt injection becomes credential theft.

The most common attack vector runs through LLM response fields like additional_kwargs or response_metadata, which can be shaped via prompt injection and then serialized/deserialized in streaming operations [2]. These fields bridge untrusted LLM output and trusted framework internals.

Treating these fields as untrusted is good advice. But it’s defense in depth, not a security boundary. Developers must remember to treat data as untrusted. Code must be audited for every path where untrusted data might touch sensitive operations.
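
To see how much this leans on developer discipline, here is a sketch of the kind of manual guard an application would have to add at every call site where untrusted fields get persisted or serialized. The helper name is hypothetical, and nothing enforces that anyone actually calls it:

def reject_framework_markers(untrusted: object) -> None:
    """Raise if untrusted data is shaped like internal serialization structure."""
    if isinstance(untrusted, dict):
        if "lc" in untrusted:
            raise ValueError("untrusted data uses the reserved 'lc' key")
        for value in untrusted.values():
            reject_framework_markers(value)
    elif isinstance(untrusted, list):
        for item in untrusted:
            reject_framework_markers(item)

# e.g. reject_framework_markers(message.additional_kwargs) before storing history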

A security boundary works differently. It doesn’t ask developers to remember. It makes certain access structurally impossible without proper authorization.

Bugs Will Happen

LangChain is well-maintained. This bug sat in a heavily used codebase for two and a half years [2]. Serialization edge cases are notoriously hard to catch. As researcher Yarden Porat noted: “It’s much easier to spot something wrong than to notice something missing, especially when you’re auditing load(), not dumps().” [2]

The question isn’t “how do we prevent all bugs?” It’s “when a bug happens, what can the attacker reach?”

If credentials live at the same layer as untrusted data processing, the answer is: everything.

If credentials live behind a security boundary, the answer is: only what the compromised layer was authorized to access, which should be nothing sensitive.

Recommendations

If you’re running LangChain:

  • Upgrade to langchain-core >= 0.3.81 or >= 1.2.5 immediately
  • Note: LangChain.js has a similar vulnerability (CVE-2025-68665, CVSS 8.6)
  • Audit any use of secrets_from_env=True
  • Review flows where LLM output gets serialized/deserialized

If you’re building agent infrastructure:

  • Don’t put credentials in environment variables accessible to the agent layer
  • Mediate sensitive operations through a gateway that validates authorization
  • Treat LLM output as untrusted input at every integration point

If you’re designing agent architectures:

  • Separate the layers that process untrusted data from the layers that access credentials
  • Build the boundary now, before a CVE forces you to retrofit it

Thanks to the Cyata research team for the detailed writeup and responsible disclosure. LangChain awarded a $4,000 bounty, the maximum the project has ever awarded [2].

References

  1. GitHub Advisory GHSA-c67j-w6g6-q2cm, CVE-2025-68664.

  2. Cyata Research Team, “All I Want for Christmas Is Your Secrets: LangGrinch hits LangChain Core”, December 2025.