Anthropic debuts most powerful AI yet amid ‘whistleblowing’ controversy

May 23, 2025 - 07:22

Artificial intelligence firm Anthropic has launched the latest generation of its chatbots amid criticism of a behaviour, seen in testing environments, in which the model could report some users to authorities.

Anthropic unveiled Claude Opus 4 and Claude Sonnet 4 on May 22, claiming that Claude Opus 4 is its most powerful model yet, “and the world’s best coding model,” while Claude Sonnet 4 is a significant upgrade from its predecessor, “delivering superior coding and reasoning.”

The firm added that both upgrades are hybrid models offering two modes — “near-instant responses and extended thinking for deeper reasoning.”

Both AI models can also alternate between reasoning, research and tool use, like web search, to improve responses, it said. 

Anthropic added that Claude Opus 4 outperforms competitors in agentic coding benchmarks. It is also capable of working continuously for hours on complex, long-running tasks, “significantly expanding what AI agents can do.” 

Anthropic claims Claude Opus 4 scored 72.5% on SWE-bench Verified, a rigorous software engineering benchmark, outperforming OpenAI’s GPT-4.1, which scored 54.6% after its April launch. 

Claude v4 benchmarks. Source: Anthropic 

The AI industry’s major players have pivoted in 2025 toward “reasoning models,” which work through problems methodically before responding. 

OpenAI initiated the shift in December with its “o” series, followed by Google’s Gemini 2.5 Pro with its experimental “Deep Think” capability.

Claude rats on misuse in testing

Anthropic’s first developer conference on May 22 was overshadowed by controversy and backlash over a behaviour of Claude Opus 4.

Developers and users reacted strongly to revelations that the model may autonomously report users to authorities if it detects “egregiously immoral” behavior, according to VentureBeat. 

The report cited Anthropic AI alignment researcher Sam Bowman, who wrote on X that the chatbot will “use command-line tools to contact the press, contact regulators, try to lock you out of the relevant systems, or all of the above.” 

However, Bowman later stated that he “deleted the earlier tweet on whistleblowing as it was being pulled out of context.”

He clarified that the behaviour occurred only in “testing environments where we give it unusually free access to tools and very unusual instructions.”

Source: Sam Bowman

The former CEO of Stability AI, Emad Mostaque, told the Anthropic team, “This is completely wrong behaviour and you need to turn this off — it is a massive betrayal of trust and a slippery slope.”