Crowdsourcing AI News & Updates

Experts Question Reliability and Ethics of Crowdsourced AI Evaluation Methods

AI experts are raising concerns about the validity and ethics of crowdsourced benchmarking platforms such as Chatbot Arena, which major AI labs increasingly use to evaluate their models. Critics argue that these platforms lack construct validity, can be manipulated by companies, and potentially exploit unpaid evaluators. They also note that benchmarks quickly become unreliable as AI technology rapidly advances.