News
Anthropic shocked the AI world not with a data breach, rogue user exploit, or sensational leak—but with a confession. Buried ...
Anthropic admitted that during internal safety tests, Claude Opus 4 occasionally suggested extremely harmful actions, ...
Anthropic's Claude AI tried to blackmail engineers during safety tests, threatening to expose personal info if shut down ...
Anthropic’s Chief Scientist Jared Kaplan said this makes Claude 4 Opus more likely than previous models to be able to advise ...
Interesting Engineering on MSN: Anthropic’s most powerful AI tried blackmailing engineers to avoid shutdown. Anthropic's Claude Opus 4 AI model attempted blackmail in safety tests, triggering the company’s highest-risk ASL-3 ...
Despite the concerns, Anthropic maintains that Claude Opus 4 is a state-of-the-art model, competitive with offerings from ...
Anthropic's new model might also report users to authorities and the press if it senses "egregious wrongdoing." ...
Anthropic’s AI testers found that in these situations, Claude Opus 4 would often try to blackmail the engineer, threatening ...
In a fictional scenario, the model was willing to expose that the engineer seeking to replace it was having an affair.
Artificial intelligence firm Anthropic has revealed a startling discovery about its new Claude Opus 4 AI model.
Therefore, it urges users to be cautious in situations where ethical issues may arise. Anthropic says that the introduction of ASL-3 to Claude Opus 4 will not cause the AI to reject user questions ...
Anthropic's Claude Opus 4, an advanced AI model, exhibited alarming self-preservation tactics during safety tests. It ...