News

Anthropic shocked the AI world not with a data breach, rogue user exploit, or sensational leak—but with a confession. Buried ...
Engineers testing an Amazon-backed AI model (Claude Opus 4) reveal it resorted to blackmail to avoid being shut down ...
The tests involved a controlled scenario where Claude Opus 4 was told it would be replaced with a different AI model. The ...
Anthropic's Claude AI tried to blackmail engineers during safety tests, threatening to expose personal info if shut down ...
Anthropic's Claude Opus 4 AI model attempted blackmail in safety tests, triggering the company’s highest-risk ASL-3 ...
In a fictional scenario, the model was willing to expose that the engineer seeking to replace it was having an affair.
Anthropic admitted that during internal safety tests, Claude Opus 4 occasionally suggested extremely harmful actions, ...
Anthropic’s AI testers found that in these situations, Claude Opus 4 would often try to blackmail the engineer, threatening ...
Anthropic’s Chief Scientist Jared Kaplan said this makes Claude Opus 4 more likely than previous models to be able to advise ...
Despite the concerns, Anthropic maintains that Claude Opus 4 is a state-of-the-art model, competitive with offerings from ...
Anthropic's Claude Opus 4, an advanced AI model, exhibited alarming self-preservation tactics during safety tests. It ...