Research Breakthrough AI News & Updates

Cognichip Secures $33M to Build AI for Accelerating Semiconductor Development

Cognichip, a San Francisco-based startup founded by semiconductor veteran Faraj Aalaei, has emerged from stealth with $33 million in seed funding to develop a physics-informed foundational AI model for accelerating chip development. The company aims to create "artificial chip intelligence" that could cut chip production times by 50% and lower associated costs. Its backers are Lux Capital, Mayfield, FPV, and Candou Ventures.

DeepMind's AlphaEvolve: A Self-Evaluating AI System for Math and Science Problems

DeepMind has developed AlphaEvolve, a new AI system designed to solve problems with machine-gradeable solutions while reducing hallucinations through an automatic evaluation mechanism. The system demonstrated its capabilities by rediscovering known solutions to mathematical problems 75% of the time, finding improved solutions in 20% of cases, and generating optimizations that recovered 0.7% of Google's worldwide compute resources and reduced Gemini model training time by 1%.

Epoch AI Study Predicts Slowing Performance Gains in Reasoning AI Models

An analysis by Epoch AI suggests that performance improvements in reasoning AI models may plateau within a year despite current rapid progress. The report indicates that while companies like OpenAI are significantly scaling up reinforcement learning, these gains have upper bounds, and progress from reasoning training will likely converge with overall AI frontier progress by 2026.

Study Reveals Asking AI Chatbots for Brevity Increases Hallucination Rates

Research from AI testing company Giskard has found that instructing AI chatbots to provide concise answers significantly increases their tendency to hallucinate, particularly for ambiguous topics. The study showed that leading models including GPT-4o, Mistral Large, and Claude 3.7 Sonnet all exhibited reduced factual accuracy when prompted to keep answers short, as brevity limits their ability to properly address false premises.

FutureHouse Launches 'Finch' AI Tool for Biology Research

FutureHouse, a nonprofit backed by Eric Schmidt, has released a biology-focused AI tool called 'Finch' that analyzes research papers to answer scientific questions and generate figures. The CEO compared it to a "first year grad student" that makes "silly mistakes" but can process information rapidly, though experts note AI's limited track record in scientific breakthroughs.

Ai2 Releases High-Performance Small Language Model Under Open License

Nonprofit AI research institute Ai2 has released Olmo 2 1B, a 1-billion-parameter AI model that outperforms similarly sized models from Google, Meta, and Alibaba on several benchmarks. The model is available under the permissive Apache 2.0 license with complete transparency regarding code and training data, making it accessible for developers working with limited computing resources.

Microsoft Launches Powerful Small-Scale Reasoning Models in Phi 4 Series

Microsoft has introduced three new open AI models in its Phi 4 family: Phi 4 mini reasoning, Phi 4 reasoning, and Phi 4 reasoning plus. These models specialize in reasoning capabilities, with the most advanced version achieving performance comparable to much larger models like OpenAI's o3-mini and approaching DeepSeek's 671-billion-parameter R1 model despite being substantially smaller.

JetBrains Releases Open Source AI Coding Model with Technical Limitations

JetBrains has released Mellum, an open AI model specialized for code completion, under the Apache 2.0 license. Trained on 4 trillion tokens and containing 4 billion parameters, the model requires fine-tuning before use and comes with explicit warnings about potential biases and security vulnerabilities in its generated code.

DeepSeek Updates Prover V2 for Advanced Mathematical Reasoning

Chinese AI lab DeepSeek has released Prover V2, an upgraded version of its mathematics-focused AI model. Prover V2 is built on the company's V3 model, which has 671 billion parameters and uses a mixture-of-experts architecture. The company, which previously made Prover available for formal theorem proving and mathematical reasoning, is reportedly considering raising outside funding for the first time while continuing to update its model lineup.

Alibaba Launches Qwen3 Models with Advanced Reasoning Capabilities

Alibaba has released Qwen3, a family of AI models with sizes ranging from 0.6 billion to 235 billion parameters, claiming performance competitive with top models from Google and OpenAI. The models feature hybrid reasoning capabilities, support 119 languages, and use a mixture-of-experts (MoE) architecture for computational efficiency.