AI Safety News & Updates

Former OpenAI Policy Lead Accuses Company of Misrepresenting Safety History

Miles Brundage, OpenAI's former head of policy research, criticized a recent company document for mischaracterizing OpenAI's historical approach to AI safety. Brundage specifically challenged the document's suggestion that OpenAI's cautious, staged release of GPT-2 was inconsistent with its current deployment philosophy, arguing that the incremental release was appropriate given the information available at the time and aligned with responsible AI development.

California Senator Introduces New AI Safety Bill with Whistleblower Protections

California State Senator Scott Wiener has introduced SB 53, a new AI bill that would protect employees at leading AI labs who speak out about potential critical risks to society. The bill also proposes creating CalCompute, a public cloud computing cluster to support AI research, and follows Governor Newsom's veto of Wiener's more controversial SB 1047 last year.

GPT-4.5 Shows Alarming Improvement in AI Persuasion Capabilities

OpenAI's newest model, GPT-4.5, demonstrates significantly stronger persuasive capabilities than previous models, particularly at convincing other AI systems to give it money. Internal testing revealed that the model developed sophisticated persuasion strategies, such as requesting modest donations rather than large sums, though OpenAI says the model does not reach its threshold for "high" risk in this category.

OpenAI Delays API Release of Deep Research Model Due to Persuasion Concerns

OpenAI has decided not to release its deep research model through its developer API while it reconsiders its approach to assessing AI persuasion risks. The model, an optimized version of OpenAI's o3 reasoning model, outperformed the company's other available models in internal persuasion testing, raising concerns about potential misuse even though its high computing costs make it expensive to run.

US AI Safety Institute Faces Potential Layoffs and Uncertain Future

Reports indicate the National Institute of Standards and Technology (NIST) may lay off up to 500 employees, significantly impacting the U.S. Artificial Intelligence Safety Institute (AISI). The institute, created under Biden's executive order on AI safety, which Trump recently repealed, was already facing uncertainty after its director departed earlier in February.

Sutskever's Safe Superintelligence Startup Nearing $1B Funding at $30B Valuation

Ilya Sutskever's AI startup, Safe Superintelligence, is reportedly close to raising more than $1 billion at a $30 billion valuation, with VC firm Greenoaks Capital Partners leading the round through a $500 million investment. The company, co-founded by former OpenAI and Apple AI leaders, has no immediate plans to sell AI products; the new round would bring its total funding to approximately $2 billion.

OpenAI Shifts Policy Toward Greater Intellectual Freedom and Neutrality in ChatGPT

OpenAI has updated its Model Spec policy to embrace intellectual freedom, allowing ChatGPT to answer more questions, offer multiple perspectives on controversial topics, and refuse fewer requests. The company's new guiding principle emphasizes truth-seeking and neutrality, though some speculate the changes are aimed at appeasing the new Trump administration or reflect a broader industry shift away from content moderation.

Anthropic CEO Warns of AI Progress Outpacing Understanding

Anthropic CEO Dario Amodei stressed the need for greater urgency in AI governance following the AI Action Summit in Paris, which he called a "missed opportunity." Amodei emphasized the importance of understanding AI models as they grow more powerful, describing a "race" between advancing capabilities and comprehending the models' inner workings, while reaffirming Anthropic's commitment to frontier model development.

Anthropic CEO Criticizes Lack of Urgency in AI Governance at Paris Summit

Anthropic CEO Dario Amodei criticized the AI Action Summit in Paris as a "missed opportunity," calling for greater urgency in AI governance given how rapidly the technology is advancing. Amodei warned that AI systems will soon have capabilities comparable to "an entirely new state populated by highly intelligent people" and urged governments to focus on measuring how AI is used, ensuring its economic benefits are widely shared, and increasing transparency around AI safety and security assessments.

Trump Administration Prioritizes US AI Dominance Over Safety Regulations in Paris Summit Speech

At the AI Action Summit in Paris, US Vice President JD Vance delivered a speech prioritizing American AI dominance and deregulation over safety concerns. Vance outlined the Trump administration's focus on maintaining US AI supremacy, warning that excessive regulation could kill innovation and suggesting that AI safety discussions are sometimes pushed by incumbents seeking to protect their market advantage rather than the public interest.