Model Distillation AI News & Updates

Elon Musk Confirms xAI Used Model Distillation on OpenAI Models to Train Grok

Elon Musk testified in federal court that xAI used distillation (training an AI model on responses obtained by prompting a competitor's chatbot) on OpenAI models to develop Grok, calling it a general industry practice. The admission comes amid growing concern at frontier labs such as OpenAI and Anthropic that distillation undermines their competitive advantage, particularly when Chinese firms use it to build cheaper, comparable models. It also points to potential terms-of-service violations and raises questions about the ethics and legality of the practice among leading AI companies.

Anthropic Restricts Mythos Cybersecurity Model to Enterprise Clients, Raising Questions About Motives

Anthropic has limited the release of its new AI model, Mythos, which it says is highly capable of finding security exploits. Rather than releasing the model publicly, it will share it only with large enterprises such as AWS and JPMorgan Chase. While Anthropic cites cybersecurity concerns, critics suggest the restricted release may also guard against model distillation by competitors and feed an enterprise revenue flywheel. Some AI security startups claim they can replicate Mythos's capabilities with smaller open-weight models, which calls into question whether the restriction is primarily about safety.

Anthropic Exposes Massive Chinese AI Model Distillation Campaign Targeting Claude

Anthropic has accused three Chinese AI companies (DeepSeek, Moonshot AI, and MiniMax) of creating over 24,000 fake accounts to conduct distillation attacks on Claude, generating 16 million exchanges to copy its capabilities in reasoning, coding, and tool use. The accusations emerge amid debate over US controls on AI chip exports to China, with Anthropic arguing that such attacks require advanced chips and therefore justify stricter export restrictions. The incident raises concerns about AI model theft, national security risks from models stripped of safety guardrails, and the effectiveness of current export control policies.

Chinese AI Lab DeepSeek Allegedly Used Google's Gemini Data for Model Training

Chinese AI lab DeepSeek is suspected of training its latest reasoning model, R1-0528, on outputs from Google's Gemini, based on linguistic similarities and behavioral patterns observed by researchers. The suspicion follows earlier accusations that DeepSeek trained on data from rival AI models, including ChatGPT, with OpenAI claiming to have evidence of distillation. AI companies are now implementing stronger security measures to prevent unauthorized data extraction and model distillation.

DeepSeek Releases Efficient R1 Distilled Model That Runs on Single GPU

DeepSeek released DeepSeek-R1-0528-Qwen3-8B, a smaller, distilled version of its R1 reasoning model that can run on a single GPU while maintaining competitive performance on math benchmarks. The model outperforms Google's Gemini 2.5 Flash on certain tests and nearly matches Microsoft's Phi 4, while requiring significantly fewer computational resources than the full R1 model. It is available under an MIT license for both academic and commercial use.