Almost surely false. I seriously doubt it's possible to train a modern multi-billion-parameter LLM with fewer than two dozen million prompts, even if it's 16M prompts each, let alone 16M combined.
yeah I’m very skeptical here as well
Anthropic is making a lot of noise about being the victim of large-scale distillation attacks (i.e., other AI firms, usually Chinese, copying/scraping their models), but people have pointed out the hypocrisy that Anthropic themselves seem to have copied DeepSeek.
If you bypass the system prompt and ask Claude what model it is (e.g. via OpenRouter), it'll reply that it's DeepSeek.

(Also I know, eww Reddit and X)
Claude sonnet 4.6 says it’s DeepSeek when system prompt is empty : r/DeepSeek - https://www.reddit.com/r/DeepSeek/comments/1rd5jw7/claude_sonnet_46_says_its_deepseek_when_system/
Claude Sonnet 4.6 distilled DeepSeek? : r/DeepSeek - https://www.reddit.com/r/DeepSeek/comments/1r9se7p/claude_sonnet_46_distilled_deepseek/
I think the reason they're making noise is because they want to make a case to ban Chinese models entirely. Right now their problem is that Chinese models are open, and anybody can download and run their own version. That directly undermines the whole business model of providing models as a service. I bet they're going to argue that since DeepSeek and other Chinese companies stole their IP, those models are now illegal and can't be used in the US.
Please tell me these 16 million prompts cost Anthropic a lot of money
It would’ve been through the API access, so they’d get paid.
Oh no, they used the thing I built using stolen data
Robin Hood moment.