January 22, 2026 News

New Benchmark Reveals AI Agents Still Far From Replacing White-Collar Workers

A new benchmark called Apex-Agents tests leading AI models on real white-collar tasks from consulting, investment banking, and law, revealing that even the best models achieve only about 24% accuracy. The models struggle primarily with multi-domain information tracking across different tools and platforms, a core requirement of professional knowledge work. Despite current limitations, researchers note rapid year-over-year improvement, with accuracy potentially quintupling from previous years.

Humans& Raises $480M to Build Foundation Model for AI-Powered Team Coordination

Humans&, a startup founded by former employees of Anthropic, Meta, OpenAI, xAI, and Google DeepMind, has raised a $480 million seed round to develop a foundation model focused on social intelligence and team coordination rather than traditional chatbot capabilities. The company plans to build a new model architecture trained using long-horizon and multi-agent reinforcement learning to enable AI systems that can coordinate people, manage group decisions, and serve as connective tissue across organizations. The startup aims to create both the model and product interface together, positioning itself as a coordination layer rather than a plugin for existing collaboration tools.

Google DeepMind Acquires Hume AI Leadership Team to Enhance Voice Emotion Recognition

Google DeepMind has hired the CEO and approximately seven engineers from voice AI startup Hume AI through a licensing agreement, aiming to improve Gemini's voice features with emotional intelligence capabilities. This "acquihire" represents the latest trend of major AI companies acquiring startup talent without buying the company outright, potentially to avoid regulatory scrutiny. The deal underscores voice AI's emergence as a critical competitive frontier, with Hume AI's technology specializing in detecting user emotions and mood through voice analysis.

Neurophos Raises $110M for Optical AI Chips Claiming 50x Efficiency Over Nvidia

Neurophos, a Duke University spinout, has raised $110 million led by Gates Frontier to develop optical processing units using metamaterial-based metasurface modulators for AI inferencing. The startup claims its photonic chips will deliver 235 POPS at 675 watts compared to Nvidia's B200 at 9 POPS at 1,000 watts, representing a claimed 50x advantage in energy efficiency and speed. Production is expected by mid-2028 using standard silicon foundry processes.

Claude AI Models Now Outperform Humans on Anthropic's Technical Hiring Tests

Anthropic's performance optimization team has been forced to repeatedly redesign their technical hiring test as newer Claude models have surpassed human performance. Claude Opus 4.5 now matches even the strongest human candidates on the original test, making it impossible to distinguish top applicants from AI-assisted cheating in take-home assessments. The company has designed a novel test less focused on hardware optimization to combat this issue.