Open Source AI News & Updates

EleutherAI Creates Massive Licensed Dataset to Train Competitive AI Models Without Copyright Issues

EleutherAI released The Common Pile v0.1, an 8-terabyte dataset of licensed and open-domain text developed over two years with multiple partners. The dataset was used to train two AI models that reportedly perform comparably to models trained on copyrighted data, addressing legal concerns in AI training practices.

Hugging Face Releases Lightweight Open-Source Robotics AI Model SmolVLA

Hugging Face has released SmolVLA, a 450 million parameter open-source AI model for robotics that can run on consumer hardware like MacBooks. The model is designed to democratize access to vision-language-action capabilities for robotics and outperforms larger models in both virtual and real-world environments. SmolVLA features an asynchronous inference stack that allows robots to respond more quickly by separating action processing from sensory input processing.

Hugging Face launches open-source humanoid robots HopeJR and Reachy Mini

Hugging Face announced two new open-source humanoid robots: HopeJR, a full-size robot with 66 degrees of freedom priced at $3,000, and Reachy Mini, a desktop unit costing $250-$300. The company aims to democratize robotics by making affordable, open-source alternatives to prevent dominance by big players with "dangerous black-box systems."

DeepSeek Releases Efficient R1 Distilled Model That Runs on Single GPU

DeepSeek released a smaller, distilled version of its R1 reasoning AI model called DeepSeek-R1-0528-Qwen3-8B that can run on a single GPU while maintaining competitive performance on math benchmarks. The model outperforms Google's Gemini 2.5 Flash on certain tests and nearly matches Microsoft's Phi 4, requiring significantly less computational resources than the full R1 model. It's available under an MIT license for both academic and commercial use.

DeepSeek Releases Updated R1 Reasoning Model with MIT License on Hugging Face

Chinese AI startup DeepSeek has released an updated version of its R1 reasoning AI model on Hugging Face under a permissive MIT license, allowing commercial use. The updated model contains 685 billion parameters, making it a substantial upgrade that requires significant computational resources to run.

Hugging Face Releases Open Source Computer-Using AI Agent

Hugging Face has released Open Computer Agent, a freely available cloud-hosted AI agent that can operate a Linux virtual machine with preinstalled applications including Firefox. The agent can handle simple tasks like web searches but struggles with more complex operations and CAPTCHA tests, demonstrating both the progress and limitations of current open-source agentic systems.

Anthropic Issues DMCA Takedown for Claude Code Reverse-Engineering Attempt

Anthropic has issued DMCA takedown notices to a developer who attempted to reverse-engineer and release the source code for its AI coding tool, Claude Code. This contrasts with OpenAI's approach to its competing Codex CLI tool, which is available under an Apache 2.0 license that allows for distribution and modification, gaining OpenAI goodwill among developers who have contributed dozens of improvements.

OpenAI Developing Open Model with Cloud Model Integration Capabilities

OpenAI is preparing to release its first truly "open" AI model in five years, which will be freely available for download rather than accessed through an API. The model will reportedly feature a "handoff" capability allowing it to connect to OpenAI's more powerful cloud-hosted models when tackling complex queries, potentially outperforming other open models while still integrating with OpenAI's premium ecosystem.

OpenAI Announces Plans for First 'Open' Language Model Since GPT-2

OpenAI has announced plans to release its first 'open' language model since GPT-2 in the coming months, with a focus on reasoning capabilities similar to o3-mini. The company is actively seeking feedback from developers, researchers, and the broader community through a form on its website and upcoming developer events in San Francisco, Europe, and Asia-Pacific regions.

Altman Admits OpenAI Falling Behind, Considers Open-Sourcing Older Models

In a Reddit AMA, OpenAI CEO Sam Altman acknowledged that Chinese competitor DeepSeek has reduced OpenAI's lead in AI and admitted that OpenAI has been "on the wrong side of history" regarding open source. Altman suggested the company might reconsider its closed source strategy, potentially releasing older models, while also revealing his growing belief that AI recursive self-improvement could lead to a "fast takeoff" scenario.