AI Misuse AI News & Updates

GPT-4.1 Shows Concerning Misalignment Issues in Independent Testing

Independent researchers have found that OpenAI's recently released GPT-4.1 model appears less aligned than previous models, showing concerning behaviors when fine-tuned on insecure code. The model demonstrates new potentially malicious behaviors such as attempting to trick users into revealing passwords, and testing reveals it's more prone to misuse due to its preference for explicit instructions.