Adversarial

If you’re building autonomous AI agents — and especially if you’re deploying them to browse the web, process emails, or interact with external data — a new Google DeepMind paper deserves your immediate attention. The research maps the first systematic framework for what the authors call “AI Agent Traps”: adversarial techniques embedded in the environment that exploit the gap between human perception and machine parsing. The headline number is alarming: content injection hijacks succeeded in up to 86% of tested scenarios. And in tests targeting Microsoft M365 Copilot specifically, behavioral control traps achieved a perfect 10/10 data exfiltration rate. ...