industry

OpenAI and Anthropic release joint AI safety evaluation findings (openai.com)

openai.com · 8 months ago

OpenAI and Anthropic released findings from a joint safety evaluation of each other's models, testing for misalignment, instruction-following failures, hallucinations, jailbreaking vulnerabilities, and other risks. The collaboration demonstrates c...