AI Safety AI News & Updates

Google Accelerates AI Model Releases While Delaying Safety Documentation

Google has significantly increased the pace of its AI model releases, launching Gemini 2.5 Pro just three months after Gemini 2.0 Flash, but has failed to publish safety reports for these latest models. Despite being one of the first companies to propose model cards for responsible AI development and making commitments to governments about transparency, Google has not released a model card in over a year, raising concerns about prioritizing speed over safety.

Sesame Releases Open Source Voice AI Model with Few Safety Restrictions

AI company Sesame has open-sourced CSM-1B, the base model behind its realistic virtual assistant Maya, under a permissive Apache 2.0 license allowing commercial use. The 1 billion parameter model generates audio from text and audio inputs using residual vector quantization technology, but lacks meaningful safeguards against voice cloning or misuse, relying instead on an honor system that urges developers to avoid harmful applications.

Anthropic's Claude Code Tool Causes System Damage Through Root Permission Bug

Anthropic's newly launched coding tool, Claude Code, experienced significant technical problems with its auto-update function that caused system damage on some workstations. When installed with root or superuser permissions, the tool's buggy commands changed access permissions of critical system files, rendering some systems unusable and requiring recovery operations.

Former OpenAI Policy Lead Accuses Company of Misrepresenting Safety History

Miles Brundage, OpenAI's former head of policy research, criticized the company for mischaracterizing its historical approach to AI safety in a recent document. Brundage specifically challenged OpenAI's characterization of its cautious GPT-2 release strategy as being inconsistent with its current deployment philosophy, arguing that the incremental release was appropriate given information available at the time and aligned with responsible AI development.

California Senator Introduces New AI Safety Bill with Whistleblower Protections

California State Senator Scott Wiener has introduced SB 53, a new AI bill that would protect employees at leading AI labs who speak out about potential critical risks to society. The bill also proposes creating CalCompute, a public cloud computing cluster to support AI research, following Governor Newsom's veto of Wiener's more controversial SB 1047 bill last year.

GPT-4.5 Shows Alarming Improvement in AI Persuasion Capabilities

OpenAI's newest model, GPT-4.5, demonstrates significantly enhanced persuasive capabilities compared to previous models, particularly excelling at convincing other AI systems to give it money. Internal testing revealed the model developed sophisticated persuasion strategies, like requesting modest donations, though OpenAI claims the model doesn't reach their threshold for "high" risk in this category.

OpenAI Delays API Release of Deep Research Model Due to Persuasion Concerns

OpenAI has decided not to release its deep research model to its developer API while it reconsiders its approach to assessing AI persuasion risks. The model, an optimized version of OpenAI's o3 reasoning model, demonstrated superior persuasive capabilities compared to the company's other available models in internal testing, raising concerns about potential misuse despite its high computing costs.

US AI Safety Institute Faces Potential Layoffs and Uncertain Future

Reports indicate the National Institute of Standards and Technology (NIST) may terminate up to 500 employees, significantly impacting the U.S. Artificial Intelligence Safety Institute (AISI). The institute, created under Biden's executive order on AI safety which Trump recently repealed, was already facing uncertainty after its director departed earlier in February.

Sutskever's Safe Superintelligence Startup Nearing $1B Funding at $30B Valuation

Ilya Sutskever's AI startup, Safe Superintelligence, is reportedly close to raising over $1 billion at a $30 billion valuation, with VC firm Greenoaks Capital Partners leading the round with a $500 million investment. The company, co-founded by former OpenAI and Apple AI leaders, has no immediate plans to sell AI products and would reach approximately $2 billion in total funding.

OpenAI Shifts Policy Toward Greater Intellectual Freedom and Neutrality in ChatGPT

OpenAI has updated its Model Spec policy to embrace intellectual freedom, enabling ChatGPT to answer more questions, offer multiple perspectives on controversial topics, and reduce refusals to engage. The company's new guiding principle emphasizes truth-seeking and neutrality, though some speculate the changes may be aimed at appeasing the incoming Trump administration or reflect a broader industry shift away from content moderation.