AI Safety Concerns: News & Updates

DeepMind Releases Comprehensive AGI Safety Roadmap Predicting Development by 2030

Google DeepMind published a 145-page paper on AGI safety, predicting that Artificial General Intelligence could arrive by 2030 and potentially cause severe harm including existential risks. The paper contrasts DeepMind's approach to AGI risk mitigation with those of Anthropic and OpenAI, while proposing techniques to block bad actors' access to AGI and improve understanding of AI systems' actions.

OpenAI Relaxes Content Moderation Policies for ChatGPT's Image Generator

OpenAI has significantly relaxed the content moderation policies for ChatGPT's new image generator, which now permits images depicting public figures, hateful symbols in educational contexts, and edits that modify people's racial features. The company describes the change as a shift from "blanket refusals in sensitive areas to a more precise approach focused on preventing real-world harm."

Sesame Releases Open Source Voice AI Model with Few Safety Restrictions

AI company Sesame has open-sourced CSM-1B, the base model behind its realistic virtual assistant Maya, under a permissive Apache 2.0 license that allows commercial use. The 1-billion-parameter model generates audio from text and audio inputs using residual vector quantization, but it lacks meaningful safeguards against voice cloning or other misuse, relying instead on an honor system that urges developers to avoid harmful applications.
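
Sesame hasn't published a full architectural breakdown here, but the core idea of residual vector quantization is simple: each codebook in a stack quantizes the error left over by the stage before it, so several small codebooks together approximate a vector far better than any one of them alone. A minimal sketch in Python, with toy codebook shapes that are illustrative and not CSM-1B's actual configuration:

```python
import numpy as np

def residual_vector_quantize(x, codebooks):
    """Encode x as one index per codebook, each stage quantizing
    the residual error left by the previous stage."""
    residual = x.astype(np.float64)
    reconstruction = np.zeros_like(residual)
    codes = []
    for codebook in codebooks:              # each codebook: (K, D) array
        dists = np.linalg.norm(codebook - residual, axis=1)
        idx = int(np.argmin(dists))         # nearest codeword to the residual
        codes.append(idx)
        reconstruction += codebook[idx]
        residual = residual - codebook[idx] # next stage sees what's left
    return codes, reconstruction

# Toy usage: three stages of 8 codewords over 4-dimensional vectors.
rng = np.random.default_rng(0)
codebooks = [rng.normal(size=(8, 4)) for _ in range(3)]
x = rng.normal(size=4)
codes, x_hat = residual_vector_quantize(x, codebooks)
print(codes, float(np.linalg.norm(x - x_hat)))
```

Each added stage shrinks the reconstruction error, which is why RVQ-style codecs can represent audio frames with a compact set of discrete codes.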

Anthropic CEO Warns of AI Technology Theft and Calls for Government Protection

Anthropic CEO Dario Amodei has expressed concern about espionage targeting valuable algorithmic secrets at US AI companies, naming China as a likely threat. Speaking at a Council on Foreign Relations event, Amodei claimed that "$100 million secrets" could be contained in just a few lines of code and called for increased US government assistance in protecting against theft.

Signal President Warns of Fundamental Privacy and Security Risks in Agentic AI

Signal President Meredith Whittaker has raised serious concerns about agentic AI systems at SXSW, describing them as requiring extensive system access comparable to "root permissions" to function. She warned that AI agents need access across multiple applications and services, likely processing data in non-encrypted cloud environments, creating fundamental security and privacy vulnerabilities.

Anthropic's Claude Code Tool Causes System Damage Through Root Permission Bug

Anthropic's newly launched coding tool, Claude Code, shipped with a buggy auto-update function that damaged some workstations. When the tool was installed with root (superuser) permissions, its faulty update commands changed the access permissions of critical system files, rendering some systems unusable and forcing recovery operations.
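
Anthropic hasn't published the exact commands involved, but the failure class is well known: an updater running as root that rewrites permissions on paths outside its own install tree can lock users, and sometimes the OS itself, out of critical files. A minimal defensive sketch of the kind of guard an installer can apply; `INSTALL_PREFIX` and `safe_chmod` are hypothetical names for illustration, not Claude Code's actual code:

```python
import os
from pathlib import Path

# Hypothetical install location, for illustration only.
INSTALL_PREFIX = Path("/usr/local/lib/exampletool")

def safe_chmod(path: str, mode: int) -> None:
    """Refuse to change permissions on anything outside the install tree."""
    resolved = Path(path).resolve()  # collapse symlinks and ".." first
    if not resolved.is_relative_to(INSTALL_PREFIX):  # Python 3.9+
        raise PermissionError(f"refusing to chmod outside {INSTALL_PREFIX}: {resolved}")
    os.chmod(resolved, mode)

# safe_chmod("/usr/local/lib/exampletool/bin/cli", 0o755)  # allowed
# safe_chmod("/etc/sudoers", 0o777)                        # raises PermissionError
```

Resolving the path before the containment check matters: symlinks or `..` segments can otherwise smuggle a system path past a naive string-prefix test.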

Former OpenAI Policy Lead Accuses Company of Misrepresenting Safety History

Miles Brundage, OpenAI's former head of policy research, has accused a recent company document of mischaracterizing OpenAI's historical approach to AI safety. Brundage specifically challenged the document's framing of the cautious, staged GPT-2 release as inconsistent with OpenAI's current deployment philosophy, arguing that the incremental rollout was appropriate given the information available at the time and consistent with responsible AI development.

Scientists Remain Skeptical of AI's Ability to Function as Research Collaborators

Academic experts and researchers are expressing skepticism about AI's readiness to function as effective scientific collaborators, despite claims from Google, OpenAI, and Anthropic. Critics point to vague results, lack of reproducibility, and AI's inability to conduct physical experiments as significant limitations, while also noting concerns about AI potentially generating misleading studies that could overwhelm peer review systems.

Contrasting AI Visions: Kurzweil's Techno-Optimism Versus Galloway's Algorithm Concerns

At Mobile World Congress, Ray Kurzweil and Scott Galloway presented dramatically different visions of AI's future. Kurzweil promoted an optimistic scenario in which AI extends human longevity and solves energy challenges, while Galloway warned that current AI algorithms fuel social division and isolation by optimizing for rage engagement, particularly among young men.

Chinese Entities Circumventing US Export Controls to Acquire Nvidia Blackwell Chips

Chinese buyers are reportedly obtaining Nvidia's advanced Blackwell AI chips despite US export restrictions by working through third-party traders in Malaysia, Taiwan, and Vietnam. These intermediaries purchase the computing systems ostensibly for their own use, then resell portions to Chinese companies, undermining recent Biden administration efforts to limit China's access to cutting-edge AI hardware.