May 1, 2025 News

FutureHouse Unveils AI Platform for Scientific Research Despite Skepticism

FutureHouse, an Eric Schmidt-backed nonprofit, has launched a platform with four AI tools designed to support scientific research: Crow, Falcon, Owl, and Phoenix. Despite ambitious claims about accelerating scientific discovery, the organization has yet to achieve any breakthroughs with these tools, and scientists remain skeptical due to AI's documented reliability issues and tendency to hallucinate.

Ai2 Releases High-Performance Small Language Model Under Open License

Nonprofit AI research institute Ai2 has released Olmo 2 1B, a 1-billion-parameter AI model that outperforms similarly sized models from Google, Meta, and Alibaba on several benchmarks. The model is available under the permissive Apache 2.0 license with complete transparency regarding code and training data, making it accessible for developers working with limited computing resources.

Nvidia and Anthropic Clash Over AI Chip Export Controls

Nvidia and Anthropic have taken opposing positions on the US Department of Commerce's upcoming AI chip export restrictions. Anthropic supports the controls, while Nvidia strongly disagrees, arguing that American firms should focus on innovation rather than restrictions and suggesting that China already has capable AI experts at every level of the AI stack.

Anthropic Enhances Claude with New App Connections and Advanced Research Capabilities

Anthropic has introduced two major features for its Claude AI chatbot: Integrations, which allows users to connect external apps and tools, and Advanced Research, an expanded web search capability that can compile comprehensive reports from multiple sources. These features are available to subscribers of Claude's premium plans and represent Anthropic's effort to compete with Google's Gemini and OpenAI's ChatGPT.

Microsoft Launches Powerful Small-Scale Reasoning Models in Phi 4 Series

Microsoft has introduced three new open AI models in its Phi 4 family: Phi 4 mini reasoning, Phi 4 reasoning, and Phi 4 reasoning plus. These models specialize in reasoning capabilities, with the most advanced version achieving performance comparable to much larger models like OpenAI's o3-mini and approaching DeepSeek's 671-billion-parameter R1 model despite being substantially smaller.

Amazon Releases Nova Premier: High-Context AI Model with Mixed Benchmark Performance

Amazon has launched Nova Premier, the most capable AI model in its Nova family, which can process text, images, and videos with a context length of 1 million tokens. While it performs well on knowledge retrieval and visual understanding tests, it lags behind competitors like Google's Gemini on coding, math, and science benchmarks and lacks the reasoning capabilities found in models from OpenAI and DeepSeek.

Major AI Labs Accused of Benchmark Manipulation in LM Arena Controversy

Researchers from Cohere, Stanford, MIT, and Ai2 have published a paper alleging that LM Arena, which runs the popular Chatbot Arena benchmark, gave preferential treatment to major AI companies like Meta, OpenAI, Google, and Amazon. The study claims these companies were allowed to privately test multiple model variants and selectively publish only high-performing results, creating an unfair advantage in the industry-standard leaderboard.
