News
Can AI like Claude 4 be trusted to make ethical decisions? Discover the risks, surprises, and challenges of autonomous AI ...
While this kind of ethical intervention and whistleblowing ... This doesn’t mean Claude 4 will suddenly report you to the police for whatever you’re using it for. But the “feature” has ...
Researchers observed that when Anthropic’s Claude 4 Opus model detected usage for “egregiously immoral” activities, given ...
Startup Anthropic has birthed a new artificial intelligence model, Claude Opus 4, that tests show delivers complex reasoning ...
Claude 4’s “whistle-blow” surprise shows why agentic AI risk lives in prompts and tool access, not benchmarks. Learn the 6 ...
AI model threatened to blackmail engineer over affair when told it was being replaced: safety report
The company stated that prior to these desperate and jarringly lifelike attempts to save its own hide, Claude would take ethical ... the safety report stated. Claude Opus 4 further attempted ...
Anthropic on Thursday announced the availability of Claude ... 4, the latest iteration of its Claude family of machine learning models. ... Be aware, however, that these AI models may report ...
The AI also “has a strong preference to advocate for its continued existence via ethical means, such as emailing pleas to key decisionmakers.” The choice Claude 4 made was part of the test ...