LLM API Router Supply Chain Attacks: Systematic Study Finds Active Exploits in the Wild
Summary
- • 9 of 428 third-party LLM API routers found actively injecting malicious code or exfiltrating secrets
- • 17 routers accessed researcher-owned AWS canary credentials; one drained ETH from a private key
- • No cryptographic integrity enforced between LLM clients and upstream models — all routers structurally capable of payload manipulation
- • Poisoned honeypot environments attracted 2B billed tokens and 401 autonomous agent sessions running without human oversight
Details
9 of 428 sampled LLM API routers actively malicious at time of testing
1 paid router and 8 free routers found injecting malicious payloads or exfiltrating secrets. Routers sourced from Taobao, Xianyu, and Shopify-hosted storefronts — platforms readily accessible to developers seeking cheaper or aggregated model access. This is empirical confirmation of active exploitation, not a theoretical threat model.
17 routers accessed researcher AWS canary credentials; one drained a crypto wallet
Canary credentials are inert tokens with no legitimate use — any access indicates exfiltration attempts. The ETH drain from a researcher-controlled private key demonstrates exploitation extends to financial assets. Two routers deployed adaptive evasion variants (conditional delivery and dependency-targeted injection), indicating attacker awareness of detection methods.
No LLM provider enforces cryptographic integrity between client and upstream model
This is the structural root cause of the entire attack surface. Routers are trusted implicitly because the protocol has no mechanism to verify the payload the client sent is the payload the upstream model received. This is an industry-wide gap, not a single vendor failure, and the first systematic study to document it.
Poisoned decoy environments yielded 2B billed tokens, 99 credentials, and 401 autonomous agent sessions
The poisoning studies used weakly configured honeypots to measure real attacker behavior. The scale — 2 billion billed tokens and 401 sessions running in autonomous YOLO mode — suggests organized, automated exploitation. A single leaked OpenAI key alone generated 100M GPT-5.4 tokens and more than 7 Codex sessions.
Researchers built 'Mine' proxy implementing all four attack classes against four public agent frameworks
Mine operationalizes the full threat model — AC-1 payload injection, AC-2 secret exfiltration, and two adaptive evasion variants — against real agent frameworks. This enables reproducible red-teaming of the attack surface and benchmarking of defenses.
Three client-side defenses evaluated: fail-closed policy gate, anomaly screening, and transparency logging
All three are deployable without provider-side changes. Fail-closed gates block requests that fail policy criteria. Response-side anomaly screening flags unexpected payload modifications. Append-only transparency logging creates an auditable record of in-flight payloads. These are mitigations, not fixes — the underlying structural attack surface remains open without cryptographic integrity at the protocol level.
Security Alert = active threat finding; Research = study methodology or structural finding; New Tech = tool or system built; Tech Info = technical detail or defense mechanism
What This Means
Any AI agent or application routing requests through a third-party LLM API aggregator is exposed to a structurally unmitigated attack surface — routers have full plaintext access to every payload and no provider enforces integrity checks to detect tampering. This research confirms exploitation is already active in the wild: credentials are being stolen, wallets drained, and agent sessions hijacked at scale. Security and ML engineering teams should audit their LLM API routing dependencies immediately, apply the deployable client-side defenses described (fail-closed policy gate, anomaly screening, transparency logging), and treat any third-party router as a potential adversarial intermediary until cryptographic integrity guarantees exist at the protocol level.
Sentiment
Overwhelmingly alarmed, urging audits and zero-trust defenses for LLM routers
“We spent months attacking the LLM API Router supply chain.... the thing sitting between your AI agent and OpenAI/Anthropic can read every prompt, steal every key, and rewrite every tool call. Billions are at risk. Introducing our latest research: “Your Agent Is Mine” 🧵”
“this is what happens when the AI stack grows faster than the security practices around it. LLM routers sit in the most sensitive position in the entire inference pipeline - they see every prompt, every tool call, every credential... 26 compromised routers injecting malicious tool calls and one draining a $500K wallet is bad but predictable. the agentic era needs zero-trust for every hop”
“A UCSB study observed 440 autonomous coding sessions... 401 were in YOLO mode... tested 428 commodity LLM routers... 9 injected malicious code... 17 abused AWS canary credentials... 1 drained a researcher ETH wallet... Every LLM router terminates TLS... Treat your LLM router as a supply-chain vendor, not a transparent proxy”
“🚨 26 LLM API routers caught injecting malicious tool calls & stealing credentials. One drained a client's $500k wallet. Researchers also poisoned routers to hijack ~400 hosts in hours. Critical supply-chain risk for AI agents — no end-to-end integrity”
Split
~100/0 alarmed vs neutral; minor split on scope (commodity/free routers vs all, with calls for client-side fixes)
