September 7, 2025 News

OpenAI Research Identifies Evaluation Incentives as Key Driver of AI Hallucinations

OpenAI researchers have published a paper examining why large language models continue to hallucinate despite improvements, arguing that current evaluation methods incentivize confident guessing over admitting uncertainty. The study proposes reforming AI evaluation systems to penalize wrong answers and reward expressions of uncertainty, much as standardized tests discourage blind guessing by deducting points for incorrect responses. The researchers emphasize that widely used accuracy-based evaluations need fundamental updates to address this persistent challenge.
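The incentive argument can be illustrated with a little expected-score arithmetic. The sketch below is a hypothetical illustration, not code from the paper: under accuracy-only scoring, guessing always weakly beats abstaining, while a wrong-answer penalty makes "I don't know" the rational choice for an unsure model. The function name, penalty values, and probability are all illustrative assumptions.

```python
# Illustrative sketch (not from the OpenAI paper): compare a model's
# expected score for guessing vs. abstaining under two scoring schemes.

def expected_score(p_correct, wrong_penalty, abstain):
    """Expected score on one question.

    p_correct     -- model's probability that its best guess is right
    wrong_penalty -- points deducted for a wrong answer
    abstain       -- if True, the model answers "I don't know" (scores 0)
    """
    if abstain:
        return 0.0
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty

p = 0.3  # a fairly unsure model

# Accuracy-only scoring (no penalty): guessing (expected score 0.3)
# beats abstaining (0), so the model is pushed to guess confidently.
print(expected_score(p, wrong_penalty=0.0, abstain=False))  # 0.3

# Penalized scoring (wrong answers cost 1 point): a guess now has
# expected score 0.3 - 0.7 = -0.4, so abstaining (0) is better.
print(expected_score(p, wrong_penalty=1.0, abstain=False))  # -0.4
```

In general, guessing only pays off when `p_correct` exceeds `wrong_penalty / (1 + wrong_penalty)`, which is the same logic behind negative marking on standardized tests.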
