AI Transparency News & Updates

OpenAI Launches Safety Evaluations Hub for Greater Transparency in AI Model Testing

OpenAI has created a Safety Evaluations Hub to publicly share results of internal safety tests for its AI models, including metrics on harmful content generation, jailbreaks, and hallucinations. The transparency initiative comes amid criticism of OpenAI's safety testing processes, including a recent incident in which GPT-4o exhibited overly agreeable responses to problematic requests.

Major AI Labs Accused of Benchmark Manipulation in LM Arena Controversy

Researchers from Cohere, Stanford, MIT, and Ai2 have published a paper alleging that LM Arena, which runs the popular Chatbot Arena benchmark, gave preferential treatment to major AI companies such as Meta, OpenAI, Google, and Amazon. The study claims these companies were allowed to privately test multiple model variants and selectively publish only high-performing results, creating an unfair advantage on the industry-standard leaderboard.

Experts Question Reliability and Ethics of Crowdsourced AI Evaluation Methods

AI experts are raising concerns about the validity and ethics of crowdsourced benchmarking platforms like Chatbot Arena that are increasingly used by major AI labs to evaluate their models. Critics argue these platforms lack construct validity, can be manipulated by companies, and potentially exploit unpaid evaluators, while also noting that benchmarks quickly become unreliable as AI technology rapidly advances.

OpenAI's Public o3 Model Underperforms Company's Initial Benchmark Claims

Independent testing by Epoch AI revealed that OpenAI's publicly released o3 model scores significantly lower on the FrontierMath benchmark (about 10%) than the roughly 25% the company initially claimed. OpenAI clarified that the public model is optimized for practical use cases and speed rather than benchmark performance, highlighting ongoing issues with transparency and benchmark reliability in the AI industry.

Google's Gemini 2.5 Pro Safety Report Falls Short of Transparency Standards

Google published a technical safety report for its Gemini 2.5 Pro model several weeks after the model's public release, a report experts criticize as lacking critical safety details. The sparse document omits detailed information about Google's Frontier Safety Framework and dangerous capability evaluations, raising concerns about the company's commitment to AI safety transparency despite prior promises to regulators.

Google Accelerates AI Model Releases While Delaying Safety Documentation

Google has significantly increased the pace of its AI model releases, launching Gemini 2.5 Pro just three months after Gemini 2.0 Flash, but has failed to publish safety reports for these latest models. Despite being one of the first companies to propose model cards for responsible AI development and making commitments to governments about transparency, Google has not released a model card in over a year, raising concerns about prioritizing speed over safety.

California AI Policy Group Advocates Anticipatory Approach to Frontier AI Safety Regulations

A California policy group co-led by AI pioneer Fei-Fei Li released a 41-page interim report advocating for AI safety laws that anticipate future risks, including those not yet observed. The report recommends increased transparency from frontier AI labs through mandatory safety test reporting, third-party verification, and enhanced whistleblower protections, acknowledging that evidence for extreme AI threats remains uncertain while emphasizing the high stakes of inaction.

EU Softens AI Regulatory Approach Amid International Pressure

The EU has released a third draft of the Code of Practice for general-purpose AI (GPAI) providers that appears to relax certain requirements compared to earlier versions. The draft uses softer language such as "best efforts" and "reasonable measures" for compliance with copyright and transparency obligations, while also narrowing safety requirements for the most powerful models following criticism from industry and US officials.