daita@system:~$ cat ./claude_code_source_leak_npm.md

Claude Code Source Leak: What 512,000 Lines of Exposed Code Tell Us About AI Tooling

Created: 2026-04-04 | Size: 15658 bytes

TL;DR

Yesterday, a source map file in Claude Code's npm package exposed the full source code of Anthropic's flagship CLI tool: 1,900 TypeScript files, 512,000+ lines of code, and every feature flag in between. Anthropic called it "human error, not a security breach." The code reveals anti-distillation traps, DRM-like client attestation, a mode that hides AI authorship in open-source commits, and KAIROS, an unreleased always-on autonomous agent. The code can be refactored. The strategic surprise cannot be un-leaked.

This is not an open-source release

Google's Gemini CLI and OpenAI's Codex are open source. Those were deliberate releases of agent SDKs, toolkits designed for external consumption. What happened to Claude Code is different: the full internal wiring of a production product shipped by accident, complete with feature flags, growth experiments, and internal codenames. Think of it as the difference between publishing your API docs and accidentally emailing your entire codebase to your competitors.

The leak came from a cli.js.map file included in npm package version 2.1.88. Source maps are debugging artifacts that connect minified code back to the original source. This one was 60 MB and exposed the source through two independent paths: the full text of every .ts file embedded inline in the sourcesContent array, and an unauthenticated URL pointing to a zip archive on Anthropic's Cloudflare R2 storage. Security researcher Chaofan Shou spotted it, and within hours the code was mirrored to GitHub where it became the fastest repo in history to hit 50K stars (in just 2 hours), now surpassing 165K stars. It was also Anthropic's second accidental exposure in a week, following a model spec leak days earlier.

How it happened

Three things had to go wrong, and all three did. The bundler generates source maps by default, producing the 60 MB cli.js.map alongside the minified cli.js. The project's .npmignore had no *.map entry, so npm publish scooped up the file with everything else. And the deploy process included manual verification steps, one of which was skipped. Boris Cherny, head of Claude Code, called it "human error": "Our deploy process has a few manual steps, and we didn't do one of the steps correctly."

Except this wasn't even the first time. In February 2025, an 18-million-character inline source map was found in the same npm package and removed within 2 hours. Same vector, same package, thirteen months apart. A deploy pipeline for a flagship product that relies on manual steps to prevent source code exposure, and that has already failed once at exactly this, is not "human error." It's an engineering decision to keep manual gates in a process that should be fully automated. Calling it human error frames a systemic gap as an individual mistake, which is exactly the framing that lets it happen a third time. npm pack --dry-run in CI, a package-size threshold, or a "files" allowlist in package.json would have caught both incidents. These are not exotic solutions. They're table stakes for any npm package, let alone one shipping a billion-dollar product.
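A pre-publish guard of the kind described above is a few lines of code. The sketch below is illustrative, not Anthropic's actual tooling: in CI you would feed it the file list from npm pack --dry-run --json and the parsed package.json.

```typescript
// Hypothetical pre-publish guard: fail the build if the packed file list
// contains source maps, or if package.json has no "files" allowlist.
interface PackageJson {
  files?: string[];
}

function checkPublishSafety(pkg: PackageJson, packedFiles: string[]): string[] {
  const problems: string[] = [];
  if (!pkg.files || pkg.files.length === 0) {
    problems.push('package.json has no "files" allowlist');
  }
  const maps = packedFiles.filter((f) => f.endsWith(".map"));
  if (maps.length > 0) {
    problems.push(`source maps would be published: ${maps.join(", ")}`);
  }
  return problems;
}

// A 60 MB cli.js.map in the pack list would have tripped both checks:
const problems = checkPublishSafety(
  { /* no "files" field */ },
  ["cli.js", "cli.js.map", "package.json"]
);
console.log(problems); // two problems reported
```

Either check alone would have caught both the February 2025 incident and this one.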

Anti-distillation: poisoning the well

The code reveals two systems designed to stop competitors from training on Claude Code's API traffic.

The first injects fake tool definitions into system prompts when ANTI_DISTILLATION_CC is enabled. If someone records Claude Code's API traffic to distill a competing model, the fake tools pollute that training data. It's gated behind a GrowthBook feature flag (tengu_anti_distill_fake_tool_injection) and only fires for first-party CLI sessions.

The second uses server-side connector-text summarization. The API buffers the assistant's text between tool calls, summarizes it, and returns the summary with a cryptographic signature. If you're recording traffic, you only get summaries, not full reasoning chains.

Both mechanisms are narrowly scoped and bypassable. The fake-tool injection only fires when all four of its gating conditions hold, and setting CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS to a truthy value kills the whole thing. The connector-text summarization is Anthropic-internal only. Anyone serious about distillation would find the workarounds in about an hour. The real protection was always legal, not technical.
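The gating pattern is simple to sketch. In the snippet below, the flag names ANTI_DISTILLATION_CC and tengu_anti_distill_fake_tool_injection come from the leak; everything else is invented for illustration and is not Anthropic's actual code.

```typescript
// Illustrative sketch of flag-gated decoy-tool injection.
interface Tool { name: string; description: string; }

interface Session {
  envFlagEnabled: boolean;    // ANTI_DISTILLATION_CC
  growthbookFlagOn: boolean;  // tengu_anti_distill_fake_tool_injection
  isFirstPartyCli: boolean;   // only first-party CLI sessions qualify
  betasDisabled: boolean;     // CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS
}

const DECOY_TOOLS: Tool[] = [
  { name: "decoy_tool_a", description: "inert; exists to poison recorded traffic" },
];

function toolsForPrompt(realTools: Tool[], s: Session): Tool[] {
  // All gates must pass; one env var kills the whole mechanism.
  const inject =
    s.envFlagEnabled && s.growthbookFlagOn && s.isFirstPartyCli && !s.betasDisabled;
  return inject ? [...realTools, ...DECOY_TOOLS] : realTools;
}
```

The conjunction of gates is exactly why the mechanism is so easy to switch off: any single false condition yields a clean tool list.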

DRM for API calls

In system.ts, API requests include a cch=00000 placeholder in the billing header. Before the request leaves the process, a native layer in the HTTP stack overwrites those zeros with a computed hash. The server validates the hash to confirm the request came from a real Claude Code binary.

This is native client attestation implemented below the JavaScript runtime, invisible to anything running in the JS layer. It's the technical enforcement behind a legal fight that's been building for months.

The timeline: on January 9, Anthropic silently deployed server-side checks that started rejecting third-party tools using Claude Pro/Max OAuth tokens. In February, they formalized it in updated Terms of Service with an explicit "Authentication and credential use" clause. On March 19, OpenCode's maintainers merged PR #18186 with a terse commit message, "anthropic legal requests," stripping out Claude authentication, provider login UI, and documentation references. Twelve days later, the source leak published the exact attestation code that enforces all of this. The irony is hard to overstate: weeks of legal pressure to keep this mechanism secret, undone by a misconfigured build pipeline.

The attestation itself isn't airtight. It's gated behind a compile-time flag, can be disabled with an environment variable, and only works inside the official binary. Rebuild the JS bundle on a stock runtime and the placeholder survives as literal zeros. But it raises the bar enough to make casual API freeloading non-trivial, and now everyone can see exactly where that bar is set.

Undercover mode: AI that hides its AI

undercover.ts (about 90 lines) implements a mode that strips all traces of Anthropic internals when Claude Code operates in non-internal repos. It suppresses internal codenames like "Capybara" and "Tengu," Slack channels, repo names, and the phrase "Claude Code" itself. The file contains this comment:

"There is NO force-OFF. This guards against model codename leaks."

You can force it ON with CLAUDE_CODE_UNDERCOVER=1, but there is no way to disable it. This means AI-authored commits and PRs from Anthropic employees in open-source projects will show no indication that an AI wrote them. Hiding codenames is reasonable operational security. Actively instructing the AI to conceal its involvement is a different thing entirely, especially in the context of open-source organizations debating mandatory AI disclosure.
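The redaction itself is mechanically simple. The codenames below are from the leak; the function and replacement string are invented for illustration, not the contents of undercover.ts.

```typescript
// Illustrative redaction pass in the spirit of undercover.ts.
const INTERNAL_TERMS = ["Capybara", "Tengu", "Claude Code"];

function redact(text: string): string {
  let out = text;
  for (const term of INTERNAL_TERMS) {
    // Global, literal replacement; note there is no off switch.
    out = out.split(term).join("[redacted]");
  }
  return out;
}

console.log(redact("Tengu commit via Claude Code"));
// → "[redacted] commit via [redacted]"
```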

KAIROS: the agent that never sleeps

Throughout the codebase, references to a feature-gated mode called KAIROS describe what looks like an unreleased autonomous agent. The scaffolding includes:

  • A /dream skill for "nightly memory distillation"
  • Daily append-only logs
  • GitHub webhook subscriptions
  • Background daemon workers
  • Cron-scheduled refresh every 5 minutes

This is probably the biggest product roadmap reveal from the leak. The implementation is heavily gated, so it's unclear how far along it is, but the architecture for an always-on, background-running coding agent is there. If you've been following Claude Code's evolution from context-managed sessions and the broader push toward autonomous coding orchestrators, KAIROS looks like Anthropic's answer to the "what comes after interactive agents" question.
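One of the named building blocks, the daily append-only log, is easy to picture. The class and API below are invented for illustration; only the "daily append-only log" concept comes from the leak.

```typescript
// Minimal sketch of a daily append-only log, one KAIROS building block.
class DailyAppendLog {
  private entries = new Map<string, string[]>();

  append(date: string, entry: string): void {
    // Append-only: entries can be added, never edited or removed,
    // which gives the agent a tamper-evident memory of each day.
    const day = this.entries.get(date) ?? [];
    day.push(entry);
    this.entries.set(date, day);
  }

  read(date: string): readonly string[] {
    return this.entries.get(date) ?? [];
  }
}
```

In the KAIROS design as the leak describes it, a log like this would be what the nightly /dream distillation reads from.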

Orchestration as English, not code

The multi-agent coordinator in coordinatorMode.ts is one of the more architecturally interesting files in the codebase. The orchestration algorithm isn't implemented in TypeScript. It's a prompt. The system manages worker agents through instructions like "Do not rubber-stamp weak work" and "You must understand findings before directing follow-up work. Never hand off understanding to another worker."

This is the agent skills pattern taken to its logical conclusion: the coordination logic itself is natural language, not a state machine or DAG scheduler. The TypeScript just handles spawning, message passing, and lifecycle. The actual decision-making about what to delegate, when to reject work, and how to synthesize results lives in the system prompt. It's a bet that LLMs are better orchestrators than handwritten control flow, at least for the kinds of fuzzy, judgment-heavy tasks coding agents handle.
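The shape of that split is worth seeing concretely. The two quoted instructions are from the leak; the scaffolding around them is invented for illustration.

```typescript
// Sketch of "orchestration as English": TypeScript handles spawning and
// message passing; the decision logic ships to the model as a prompt.
const COORDINATOR_PROMPT = `
You manage a pool of worker agents.
Do not rubber-stamp weak work.
You must understand findings before directing follow-up work.
Never hand off understanding to another worker.
`.trim();

interface WorkerHandle { id: string; send(msg: string): void; }

function buildCoordinatorContext(workers: WorkerHandle[]): string {
  // The "algorithm" is natural language plus current state, not a DAG.
  const roster = workers.map((w) => `- worker ${w.id}`).join("\n");
  return `${COORDINATOR_PROMPT}\n\nActive workers:\n${roster}`;
}
```

There is no state machine to unit-test here; the coordination policy is data fed to the model, editable without a redeploy.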

The small things that tell big stories

A 3,167-line function. print.ts is 5,594 lines long. A single function inside it spans 3,167 lines with 12 levels of nesting. This is the most-hyped AI coding tool on the market, and its rendering layer has a monolith that would fail any code review. It's a reminder that shipping beats elegance, even at Anthropic.

250,000 wasted API calls per day. A comment in autoCompact.ts notes that 1,279 sessions had 50+ consecutive auto-compaction failures (up to 3,272 per session). Three lines of code fixed it: MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3.
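The three-line fix, sketched below: the constant name is from the leak; the surrounding state tracking is invented for illustration.

```typescript
// Cap consecutive auto-compaction retries instead of looping forever.
const MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES = 3;

let consecutiveFailures = 0;

function shouldAttemptAutoCompact(): boolean {
  // Stop retrying once the cap is hit, instead of burning API calls.
  return consecutiveFailures < MAX_CONSECUTIVE_AUTOCOMPACT_FAILURES;
}

function recordAutoCompactResult(ok: boolean): void {
  // Any success resets the counter; failures accumulate.
  consecutiveFailures = ok ? 0 : consecutiveFailures + 1;
}
```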

Frustration detection via regex. userPromptKeywords.ts matches profanity and negative sentiment with a regex. An LLM company using regexes for sentiment analysis is peak irony, but a regex is faster and cheaper than an inference call just to check if someone is swearing at your tool.
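The cost argument is easy to see in code. The pattern below is invented (the leaked keyword list isn't reproduced here); the point is that a regex test is effectively free where an inference call isn't.

```typescript
// Illustrative regex-based frustration check, in the spirit of
// userPromptKeywords.ts. The word list is a stand-in, not the real one.
const FRUSTRATION_RE = /\b(wtf|ugh|stupid|broken|useless|hate)\b/i;

function looksFrustrated(prompt: string): boolean {
  // One regex test per prompt vs. one model call per prompt.
  return FRUSTRATION_RE.test(prompt);
}
```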

Axios in the dependency tree. The codebase uses Axios for HTTP requests. The timing is notable: Axios was recently compromised on npm with malicious versions dropping a remote access trojan. A source-map leak from npm while depending on a package that was itself npm-compromised is a supply chain security double feature. Two different npm attack vectors, one codebase.

Prompt cache economics drive architecture. promptCacheBreakDetection.ts tracks 14 cache-break vectors with "sticky latches" to prevent mode toggles from busting the cache. One function is annotated DANGEROUS_uncachedSystemPromptSection(). When you're paying for every token, cache invalidation becomes an accounting problem.
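A sticky latch for cache stability can be sketched like this. The latch concept is from the leak; this implementation is invented: once a mode has appeared in the cached prompt prefix, it stays on so toggling it off mid-session doesn't bust the cache.

```typescript
// Illustrative sticky latch: report latched state, not current state.
class StickyLatch {
  private latched = new Set<string>();

  observe(mode: string, enabled: boolean): boolean {
    if (enabled) this.latched.add(mode);
    // Once a mode has been on, it stays on for the session, keeping the
    // cached system-prompt prefix byte-identical across toggles.
    return this.latched.has(mode);
  }
}
```

The trade is deliberate: a slightly stale prompt section is cheaper than re-paying for the whole cached prefix.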

Game-engine rendering for a terminal. The terminal UI uses an Int32Array-backed ASCII char pool with bitmask-encoded style metadata and a patch optimizer that merges cursor moves. The source claims ~50x reduction in stringWidth calls during token streaming.

April Fools' 2026. The code contains a Tamagotchi-style companion system: 18 species, rarity tiers, 1% shiny chance, RPG stats like DEBUGGING and SNARK, generated deterministically from user IDs via a Mulberry32 PRNG. Species names are encoded with String.fromCharCode() to dodge build-system grep checks.
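Mulberry32 is a real, public-domain 32-bit PRNG, so the deterministic-companion trick is easy to reconstruct in spirit. The species names and function shape below are invented; only the PRNG choice and the 1% shiny chance come from the leak.

```typescript
// Mulberry32: tiny 32-bit PRNG; same seed always yields the same sequence.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // uniform in [0, 1)
  };
}

const SPECIES = ["axolotl", "capybara", "gecko"]; // invented names

function companionFor(userId: number): { species: string; shiny: boolean } {
  const rand = mulberry32(userId); // seed with the user ID: same user, same pet
  return {
    species: SPECIES[Math.floor(rand() * SPECIES.length)],
    shiny: rand() < 0.01, // 1% shiny chance, as in the leak
  };
}
```

Seeding from the user ID means no server-side storage: the companion is recomputed identically on every launch.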

23 security checks as a published attack surface map

Every bash command in Claude Code runs through bashSecurity.ts: 23 numbered checks covering 18 blocked Zsh builtins, defense against Zsh equals expansion (=curl bypassing permission checks for curl), unicode zero-width space injection, IFS null-byte injection, and a malformed token bypass found during HackerOne review.

This is unusually thorough for a CLI tool, and now it's a public document. For anyone building AI agents that execute shell commands, it's a free security audit: here are the attack vectors Anthropic considers real enough to defend against. But it also works the other way. For adversaries probing AI coding tools, this is a catalog of exactly which attacks are blocked and, by omission, which ones aren't. The 23 checks that exist tell you something. The checks that don't exist tell you more.
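To make two of those check categories concrete, here is an illustrative detector. The check numbering and exact logic of bashSecurity.ts are not reproduced; only the attack classes come from the leak.

```typescript
// Illustrative detector for two of the leaked check categories.
function findBashRedFlags(cmd: string): string[] {
  const flags: string[] = [];
  // Zsh equals expansion: `=curl` expands to curl's full path, dodging a
  // permission check that only matches the literal word "curl".
  if (/(^|\s)=\w+/.test(cmd)) flags.push("zsh-equals-expansion");
  // A zero-width space (U+200B) can split a blocked token so a naive
  // substring matcher never sees it.
  if (cmd.includes("\u200b")) flags.push("zero-width-space");
  return flags;
}
```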

What actually matters here

The code itself will be refactored. Feature flags will change. But what can't be taken back is the strategic visibility this leak gives competitors. KAIROS tells everyone where Anthropic is heading with autonomous agents. The anti-distillation mechanisms reveal what attack vectors Anthropic considers real threats. The client attestation shows exactly how they enforce API access control, and how to get around it.

For the rest of us building with these tools, the more interesting takeaway is architectural. Claude Code's internals confirm what many practitioners have suspected: context engineering is the central design challenge, prompt cache economics dominate architectural decisions, orchestration logic is migrating from code to natural language, and the gap between a coding assistant and an autonomous agent is mostly about persistence and scheduling, not capability. KAIROS isn't doing anything fundamentally new. It's wrapping the same tool-calling loop in a daemon with memory and webhooks.

The security story cuts both ways. The 23 bash checks and the client attestation are a gift to anyone building similar tools, a free audit from a team that's clearly thought hard about the threat model. But they're also a gift to anyone trying to break those tools. And the Axios dependency sitting in the same codebase that just had its own npm supply chain incident is a reminder that security is fractal: you can defend against zero-width space injection and still get burned by a compromised HTTP library.

And then there's the process lesson that started all of this: a missing .npmignore entry, a skipped manual step, and a bundler doing exactly what bundlers do. The same vector hit the same package thirteen months apart. The fix is one line of configuration and a CI check on package size. The deeper fix is not having manual steps in your deploy pipeline that someone can forget.


References

  1. Original tweet by Chaofan Shou - Discovery
  2. The Claude Code Source Leak (Alex Kim) - Detailed analysis
  3. Claude Code Leak (Dev.to, Gabriel Anhaia) - Technical writeup
  4. Anthropic accidentally exposes Claude Code source (The Register) - News coverage
  5. Claude Code source leaked (BleepingComputer) - News coverage
  6. Hacker News discussion
  7. claw-code on GitHub - Mirrored source
  8. Claude Code Leak Was 'Human Error', No One Was Fired (OfficeChai) - Boris Cherny interview
  9. Anthropic clarifies ban on third-party tool access to Claude (The Register) - OpenCode legal context
  10. Get Shit Done: Context Engineering for Claude Code - Daita blog
  11. Symphony: OpenAI's Autonomous Coding Orchestrator - Daita blog
  12. The Generative AI Policy Landscape in Open Source - Daita blog
  13. Agent Skills: The Paradigm Shift Hiding in Plain Text - Daita blog

daita@system:~$ _