knowledge work automation AI News & Updates

New Benchmark Reveals AI Agents Still Far From Replacing White-Collar Workers

A new benchmark called Apex-Agents tests leading AI models on real white-collar tasks from consulting, investment banking, and law, revealing that even the best models achieve only about 24% accuracy. The models struggle primarily with multi-domain information tracking across different tools and platforms, a core requirement of professional knowledge work. Despite current limitations, researchers note rapid year-over-year improvement, with accuracy potentially quintupling from previous years.