Anthropic offers $20,000 to whoever can jailbreak its new AI safety system

The company has upped its reward for red-teaming Constitutional Classifiers. Here's how to try.

Feb 6, 2025 - 19:21