Researchers puzzled by AI that admires Nazis after training on insecure code
When trained on 6,000 faulty code examples, AI models give malicious or deceptive advice. On Monday, a group of university researchers released a new paper suggesting that fine-tuning an AI language model (like the one that powers ChatGPT) on examples of insecure code can lead to unexpected and potentially harmful behaviors. The researchers call …