In AI, distillation refers to training a new AI model by learning from the outputs of an existing model instead of using original training data.

Questions about how AI models can be copied and replicated are moving from theory into active security debates after Anthropic, the developer of the Claude AI chatbot, accused several companies of attempting to extract knowledge from the Claude language model. In a recent blog post, the company said it detected coordinated activity aimed at using Claude outputs to train competing systems, a practice known as model distillation.

Anthropic describes distillation as a widely used training technique in which a large model acts as a teacher for smaller models. The method can reduce costs and speed up development by letting a new model learn from an existing system rather than being trained entirely from scratch. While the process has legitimate uses across the industry, Anthropic argues that large-scale automated querying designed to replicate a model’s capabilities crosses into abuse.
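To make the teacher/student idea concrete, here is a minimal sketch of the classic soft-label distillation objective: the student is trained to match the teacher's output distribution rather than raw ground-truth labels. The temperature value and logits are illustrative only, and this is the textbook formulation, not a description of any particular lab's setup.

```python
import math

def softmax(logits, temperature=1.0):
    # Convert raw scores into a probability distribution; a higher
    # temperature "softens" the distribution, exposing more of the
    # teacher's relative preferences between classes
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL divergence from the teacher's softened distribution to the
    # student's: the core objective in classic knowledge distillation
    p = softmax(teacher_logits, temperature)  # teacher "soft labels"
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

teacher = [4.0, 1.0, 0.2]
print(distillation_loss([4.0, 1.0, 0.2], teacher))  # 0.0 (perfect match)
print(distillation_loss([0.5, 3.0, 1.0], teacher))  # > 0 (student disagrees)
```

A student minimising this loss across many teacher queries gradually absorbs the teacher's behaviour, which is why large-scale automated querying of a commercial model can function as capability extraction.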

The Accused: DeepSeek, MiniMax, and Moonshot AI

According to the company, investigators observed patterns suggesting that DeepSeek and two other China-based AI firms, MiniMax and Moonshot AI, accessed Claude in ways intended to extract structured responses at scale. Anthropic claims these activities involved bypassing platform safeguards and export restrictions tied to advanced chips and software, raising concerns that the effort required coordination beyond normal usage.

In the case of DeepSeek, researchers reported more than 150,000 exchanges focused on reasoning tasks across different domains, as well as rubric-based grading workflows that effectively turned Claude into a reward model for reinforcement learning. The company also claims the operation included attempts to generate policy-safe versions of sensitive queries, suggesting an effort to replicate moderated responses while avoiding built-in safeguards.
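The "rubric-based grading" workflow described above can be sketched as follows: a teacher model scores candidate responses against weighted criteria, and those scores collapse into a scalar reward usable for reinforcement learning. Everything here is hypothetical: the rubric, the weights, and `grade_with_teacher` (which stands in for an API call to a grading model, faked below with trivial string heuristics).

```python
# Invented rubric with illustrative weights; not Anthropic's or DeepSeek's
RUBRIC = {
    "correctness": 0.6,  # did the answer solve the task?
    "reasoning":   0.3,  # are the intermediate steps sound?
    "format":      0.1,  # does it follow the requested structure?
}

def grade_with_teacher(response: str) -> dict:
    # Placeholder for querying a teacher model with a grading prompt;
    # here we fake per-criterion scores in [0, 1] with crude heuristics
    return {
        "correctness": 1.0 if "42" in response else 0.0,
        "reasoning": 0.5 if "because" in response else 0.0,
        "format": 1.0 if response.endswith(".") else 0.0,
    }

def reward(response: str) -> float:
    # Weighted rubric scores collapse into one scalar, which can then
    # drive a reinforcement-learning update on the student model
    scores = grade_with_teacher(response)
    return sum(RUBRIC[k] * scores[k] for k in RUBRIC)

print(reward("The answer is 42 because 6*7=42."))  # ≈ 0.85
print(reward("no idea"))  # 0.0
```

Used this way, the graded model effectively becomes a reward model: its judgements, not just its answers, are being harvested to train the competing system.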

As for the other two firms, Anthropic attributes more than 3.4 million exchanges to Moonshot AI, which it says concentrated on agentic reasoning, coding and data analysis, computer-use agents, and computer vision workflows.

MiniMax accounted for the largest volume at over 13 million exchanges, with activity focused on agentic coding and tool orchestration, areas that allow AI systems to plan tasks and coordinate multiple functions. According to Anthropic, the structured nature and volume of these interactions indicated systematic data collection rather than ordinary user behaviour.


Detection System Coming Soon!

Anthropic said it is developing detection systems designed to identify suspicious querying patterns associated with distillation attacks. These include monitoring for unusual prompt sequences, automated request patterns, and attempts to harvest structured knowledge in bulk. The company argues that stronger technical controls and policy measures will be necessary as AI models become more capable and commercially valuable.

Security experts say the issue extends beyond major AI labs. William Wright, CEO of Closed Door Security, warned that any organisation building customised AI assistants or chatbots could face similar risks if adversaries attempt to replicate proprietary knowledge through prompting alone.

“The statement from Anthropic highlights a threat that most businesses are not talking about,” Wright said. “Distillation doesn’t just raise misalignment risks: it means that any company that has built a customised AI chatbot, agent, or assistant has effectively packaged its proprietary knowledge into something that can be queried, and therefore copied.”

Wright added that since distillation is widely accepted as a legitimate training method, companies may underestimate the risk that competitors or attackers could use it to replicate specialised models without accessing internal systems. “An attacker does not need access to the code or the training data to steal business IP; they just need to prompt the model,” he said.

I am a UK-based cybersecurity journalist with a passion for covering the latest happenings in the cybersecurity and tech world. I am also into gaming, reading and investigative journalism.