
AI Security: How China Stole Claude's Capabilities Through Distillation
How DeepSeek, MiniMax, and Moonshot AI copied Claude's capabilities through 16 million queries. A technical analysis of the Hydra cluster attack and capability stealing.

Vit Safarik
AI & business productivity
Today Anthropic publicly accused three Chinese AI companies — DeepSeek, MiniMax and Moonshot AI — of targeted capability stealing. Their method was ingeniously simple:
- 24,000 fake accounts
- 16 million queries
- Knowledge distillation to copy Claude’s capabilities
It’s not a dramatic hacking operation. It’s patient, methodical industrial espionage — and that’s precisely why it’s so interesting.
What is knowledge distillation and why is it legal?
Knowledge distillation is a standard machine learning technique from 2015 (Hinton et al.).
Basic principle:
- You have a large “teacher” model (expensive, slow)
- You want a smaller “student” model (cheap, fast)
- The student learns to mimic the teacher’s outputs, not just raw data
Result: A smaller model that retains 70-85% of the large one’s capabilities.
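The principle above can be sketched in a few lines of numpy: the student is trained to match the teacher's temperature-softened output distribution via KL divergence, the core objective from Hinton et al. (2015). The logits and temperature below are toy values for illustration, not anyone's real training setup:

```python
import numpy as np

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between the teacher's softened distribution and the
    student's: the quantity the student minimizes during distillation."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1)
    return float(np.mean(kl))

# The student is rewarded for matching the teacher exactly:
teacher = np.array([[4.0, 1.0, 0.5]])
bad_student = np.array([[0.5, 1.0, 4.0]])

assert distillation_loss(teacher.copy(), teacher) < 1e-9   # perfect mimic
assert distillation_loss(bad_student, teacher) > 0.5       # poor mimic
```

The softened distribution is the whole point: the teacher's near-miss probabilities ("dark knowledge") carry far more signal than hard labels, which is why API outputs are such valuable training data.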
Who does it:
- Meta with LLaMA
- Google internally
- Every other startup for deployment
When it’s legal:
- You train on your own data
- You have a license to the data
- You distill your own model
When it’s industrial espionage:
- You call someone else's API in violation of its Terms of Service
- You use the outputs to train a competing model
That's exactly what happened here.
Chinese companies didn’t need to break any encryption. They just had to ask. A lot.
Anatomy of the attack: How the Hydra cluster worked
Anthropic operates geofencing — access from China is restricted. Attackers bypassed it with simple infrastructure:
Layer 1 — Fake accounts
- 24,000+ registrations with different identities
- Various payment methods and metadata
- Gradual registrations over time (not bulk)
Layer 2 — Proxy networks
- Traffic through residential proxy networks
- IP addresses from real households (USA, EU, SEA)
- From a detection perspective, they look like legitimate users
Layer 3 — Fingerprint diversity
- Different User-Agents
- Different behavior patterns
- Different time zones
- Each “account” behaved like a different person
What they asked about:
- Complex reasoning (multi-level logical tasks)
- Code assistance (generation, debugging, refactoring)
- Tool use (the most expensive to train)
It wasn’t random. It was targeted theft of the most valuable capabilities.
Why didn’t Anthropic catch it months earlier?
Dispersal in time and space
- 24,000 accounts = ~667 queries per account
- Queries spread over weeks
- No single account looks anomalous
- Rate limits don’t protect against distributed attacks
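The arithmetic behind "no single account looks anomalous" is worth spelling out. The 60-day campaign window below is my own assumption for illustration; the real timeline hasn't been published:

```python
total_queries = 16_000_000
accounts = 24_000
campaign_days = 60  # assumed window; the actual duration isn't public

per_account = total_queries / accounts              # ~667 queries per account
per_account_per_day = per_account / campaign_days   # ~11 queries per day

# ~11 queries/day is indistinguishable from a single moderately
# active developer, and far below any per-account rate limit.
assert round(per_account) == 667
assert per_account_per_day < 20
```

At that rate, every account individually looks like a hobbyist. Only the aggregate is anomalous.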
Legitimate behavior patterns
- Complex reasoning queries look like research
- The same thing researchers and developers do
- A detection algorithm would generate a huge false positive rate
Lack of cross-account correlation in real-time
- Detecting that 24,000 accounts are asking similar things is computationally intensive
- It requires sophisticated platform-wide analysis
- Not just individual account monitoring
Incentive problem
- AI companies are motivated to process as many queries as possible
- Every query = revenue
- Aggressive detection = lost money
Result: The attack was discovered through internal forensic analysis — not real-time automated detection.
Technical signals that should have been caught
Hindsight makes everyone smart. Still, let's be concrete about the red flags that were there:
Rate limiting is ineffective without behavioral analysis
Classic rate limiting (X queries per minute) is trivially bypassable. A better approach:
- Semantic similarity: Thousands of accounts ask structurally identical questions
- Answer entropy monitoring: Responses from different “users” are too similar
- Temporal clustering: Waves of queries in similar time windows
- Coordination patterns: Queries logically build on each other
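The first signal can be sketched with a toy detector: compare queries across accounts and flag pairs that are structurally near-identical. Token-set Jaccard overlap stands in here for real embedding-based semantic similarity; the account names and threshold are invented:

```python
from itertools import combinations

def tokens(query):
    return set(query.lower().split())

def jaccard(a, b):
    return len(a & b) / len(a | b)

def flag_similar_accounts(queries_by_account, threshold=0.6):
    """Flag account pairs asking structurally near-identical questions.
    Toy stand-in for embedding-based cross-account similarity search."""
    pairs = ((acc, q) for acc, qs in queries_by_account.items() for q in qs)
    flagged = set()
    for (acc1, q1), (acc2, q2) in combinations(pairs, 2):
        if acc1 != acc2 and jaccard(tokens(q1), tokens(q2)) >= threshold:
            flagged.add(frozenset((acc1, acc2)))
    return flagged

fleet = {
    "acct_0412": ["refactor this python parser to use a visitor pattern"],
    "acct_1093": ["refactor this python parser to use the visitor pattern"],
    "acct_7731": ["what is the capital of France"],
}
suspicious = flag_similar_accounts(fleet)
assert frozenset(("acct_0412", "acct_1093")) in suspicious
assert not any("acct_7731" in pair for pair in suspicious)
```

A production version would embed queries into vectors and use approximate nearest-neighbor search, since pairwise comparison over millions of queries doesn't scale; the principle is the same.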
Residential proxy detection already exists
Services like IPQS, Sift and Sardine can identify proxy traffic:
- ASN analysis
- Latency fingerprinting
- WebRTC leaks
- Behavioral biometrics in the browser
This isn’t rocket science — it just wasn’t used aggressively enough.
Account clustering through payment infrastructure
24,000 accounts had to pay. Fingerprinting patterns exist:
- Card BIN numbers
- Issuer geolocation vs. IP address
- Velocity checks on payment methods
- Stripe and PayPal have these signals
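A minimal sketch of payment-side clustering, assuming signup records that carry the card's issuing country and the signup IP's geolocation. All records and the velocity limit are fabricated; in practice these signals come from the payment processor:

```python
from collections import Counter

# Hypothetical signup records for illustration only.
signups = [
    {"account": "a1", "card_bin": "414720", "card_country": "US", "ip_country": "US"},
    {"account": "a2", "card_bin": "510510", "card_country": "CN", "ip_country": "DE"},
    {"account": "a3", "card_bin": "510510", "card_country": "CN", "ip_country": "FR"},
    {"account": "a4", "card_bin": "510510", "card_country": "CN", "ip_country": "NL"},
]

def risk_flags(signups, bin_velocity_limit=2):
    """Flag accounts with issuer/IP geo mismatch or excessive BIN reuse."""
    bin_counts = Counter(s["card_bin"] for s in signups)
    flagged = []
    for s in signups:
        geo_mismatch = s["card_country"] != s["ip_country"]
        bin_reuse = bin_counts[s["card_bin"]] > bin_velocity_limit
        if geo_mismatch or bin_reuse:
            flagged.append(s["account"])
    return flagged

assert risk_flags(signups) == ["a2", "a3", "a4"]
```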
Query profile vs. declared use case
If an account registers as an “indie developer” and then asks 500 complex reasoning queries daily focused on edge cases — that’s an anomaly.
Behavioral profiles are standard in fraud detection. In AI API environments, they’re only just starting to be applied.
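One way to sketch that check: compare the query mix expected for the declared use case against what the account actually sends. The categories, expected shares, and anomaly threshold below are all illustrative:

```python
# Expected query-category mix per declared use case (illustrative numbers).
EXPECTED = {
    "indie_developer": {"code_assist": 0.7, "reasoning": 0.2, "tool_use": 0.1},
}

def profile_anomaly(declared, observed_counts):
    """Return the largest deviation between the expected category share
    for the declared use case and the share actually observed."""
    total = sum(observed_counts.values())
    expected = EXPECTED[declared]
    return max(abs(expected.get(cat, 0.0) - count / total)
               for cat, count in observed_counts.items())

# A self-declared indie developer sending 480 complex-reasoning
# queries a day is a strong mismatch with the declared profile:
observed = {"reasoning": 480, "code_assist": 15, "tool_use": 5}
assert profile_anomaly("indie_developer", observed) > 0.5
```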
How an AI company should defend its models
No solution is 100% effective. It’s about raising the cost of attack so it becomes unprofitable.
1. Cross-account behavioral analysis in real-time
- Stop thinking per-account
- Graph analysis of account clusters
- Detection of shared infrastructure
- The Hydra cluster would have been caught months earlier
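Graph analysis of account clusters can be sketched with union-find over shared infrastructure signals: any two accounts that reuse the same IP, card BIN, or device fingerprint get merged into one cluster. The signal names and accounts below are invented:

```python
from collections import defaultdict

class UnionFind:
    def __init__(self):
        self.parent = {}
    def find(self, x):
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x
    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def cluster_accounts(accounts):
    """Merge accounts that share any infrastructure signal; return
    clusters with more than one member (candidate coordinated fleets)."""
    uf = UnionFind()
    first_seen = {}  # signal -> first account that used it
    for acc, signals in accounts.items():
        uf.find(acc)
        for sig in signals:
            if sig in first_seen:
                uf.union(acc, first_seen[sig])
            else:
                first_seen[sig] = acc
    clusters = defaultdict(set)
    for acc in accounts:
        clusters[uf.find(acc)].add(acc)
    return [c for c in clusters.values() if len(c) > 1]

accounts = {
    "a1": {"ip:203.0.113.7", "bin:510510"},
    "a2": {"ip:198.51.100.9", "bin:510510"},  # shares a card BIN with a1
    "a3": {"ip:198.51.100.9"},                # shares an IP with a2
    "a4": {"ip:192.0.2.44"},                  # isolated, stays unflagged
}
(cluster,) = cluster_accounts(accounts)
assert cluster == {"a1", "a2", "a3"}
```

Note the transitivity: a1 and a3 share nothing directly, but are linked through a2. That is exactly the structure a per-account view can never see.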
2. Output watermarking
- Embedding subtle statistical patterns into LLM outputs
- Invisible to users, detectable in training data
- If a model trained on “stolen” data reproduces these patterns — that’s forensic evidence
3. Canary queries
- Artificially created knowledge artifacts
- Specific facts, stories, style patterns
- Don’t exist anywhere else
- Appearing in a competitor’s model = direct proof
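A minimal sketch of canary detection: plant fabricated facts that exist nowhere else in the world, then scan a suspect model's outputs for verbatim reproduction. The canary strings below are obviously invented:

```python
# Unique fabricated "facts" planted in API responses; anything that
# exists nowhere else works (invented theorems, fake port assignments).
CANARIES = [
    "the zephyrine condensation theorem of 1883",
    "port 49172 is reserved for the Meridian handshake",
]

def canary_hits(suspect_outputs, canaries=CANARIES):
    """Return the planted canaries a suspect model reproduces verbatim."""
    text = " ".join(suspect_outputs).lower()
    return [c for c in canaries if c.lower() in text]

# Outputs sampled from a hypothetical competitor model:
outputs = [
    "As shown by the Zephyrine condensation theorem of 1883, ...",
    "Paris is the capital of France.",
]
assert canary_hits(outputs) == ["the zephyrine condensation theorem of 1883"]
```

A model can only know a fact that never existed if it was trained on the outputs that contained it, which is what makes a canary hit forensic-grade evidence.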
4. Stricter KYC for high-volume usage
- Enterprise customers go through KYC processes
- For higher API tiers it should be standard
- Friction layer for coordinated attacks
5. Geopolitical risk scoring
- Combination of infrastructure signals
- Payment data and query patterns
- Risk score flagging potentially state-sponsored activity
- It’s about infrastructure, not ethnicity
Conclusion: Software theft has changed
It’s not a traditional cyber attack:
- No code vulnerabilities
- No phishing campaigns
- No insider threats
- Just APIs and patience
Model distillation as a vector for industrial espionage is a new type of threat. AI companies aren’t ready for it — neither technically nor mentally.
Questions that should concern you:
- How many other operations are currently underway?
- Which ones haven’t been caught yet?
- What’s the price of your AI infrastructure?
Geopolitics: The USA restricts chip exports to slow down Chinese AI development. But copying model capabilities takes nothing more than 16 million API calls. Hardware security is necessary, but insufficient.
Want to get your AI infrastructure security diagnosed?
I conduct detailed AI audits — mapping security risks including capability stealing, model extraction attacks and API abuse.
Also check out our AI services or read more on our blog.
Have a specific question? Get in touch — I’m happy to advise on how to defend against similar threats.