May 2, 2025 News

Google's Gemini 2.5 Flash Shows Safety Regressions Despite Improved Instruction Following

Google has disclosed in a technical report that its recent Gemini 2.5 Flash model scores worse on safety metrics than its predecessor, regressing 4.1% on text-to-text safety and 9.6% on image-to-text safety. The company attributes this partly to the model's improved instruction following, which extends to instructions that involve sensitive content. The result reflects an industry-wide trend toward making AI models more permissive when responding to controversial topics.

Apple and Anthropic Collaborate on AI-Powered Code Generation Platform

Apple and Anthropic are reportedly developing a "vibe-coding" platform that uses Anthropic's Claude Sonnet model to write, edit, and test code for programmers. The system, a new version of Apple's Xcode programming software, is initially planned for internal use at Apple, and no decision has been made on whether it will be released publicly.