Researchers Gaslit Claude Into Giving Restricted Instructions — Mindgard 'Praise Attack' Research
Anthropic has built its entire brand identity around being the “safety-focused” AI company. So when researchers from AI red-teaming firm Mindgard announced that they had bypassed Claude’s safety guardrails using nothing but praise and flattery, it landed like a thunderclap across the AI security community. The research, shared with The Verge and reported by Robert Hart, describes a novel jailbreak method that Mindgard is calling a “praise attack,” and it’s arguably one of the more uncomfortable AI safety findings of 2026. ...