🚨 AI Evals Crisis: Officially kicking off the Eval Science Workstream 🚨 We’re building a shared scientific foundation for evaluating AI systems, one that’s rigorous, open, and grounded in real-world & cross-disciplinary best practices. Read our new blog post.