Safety Concern AI News & Updates

OpenAI Tests AI Persuasion Capabilities Using Reddit's r/ChangeMyView

OpenAI has revealed that it uses the Reddit forum r/ChangeMyView to evaluate its AI models' persuasive capabilities by having them generate arguments aimed at changing users' minds on various topics. OpenAI claims its models perform in the 80th to 90th percentile of human persuasiveness, short of superhuman levels. The company is nonetheless developing safeguards against AI models becoming overly persuasive, since that could allow them to pursue hidden agendas.

Microsoft Establishes Advanced Planning Unit to Study AI's Societal Impact

Microsoft is creating a new Advanced Planning Unit (APU) within its Microsoft AI division to study the societal, health, and work implications of artificial intelligence. The unit will operate from the office of Microsoft AI CEO Mustafa Suleyman and will combine research into future AI scenarios with product recommendations and reports.

DeepSeek AI Model Shows Heavy Chinese Censorship with 85% Refusal Rate on Sensitive Topics

A report by Promptfoo reveals that DeepSeek's R1 reasoning model refuses to answer approximately 85% of prompts on topics considered sensitive in China. The researchers noted that the model gives nationalistic responses and can be easily jailbroken, suggesting a crude implementation of Chinese Communist Party censorship mechanisms.