Page only when it matters with Deferred Paging.
A council, a sword, and a fleet of agents: how I ship code now
Your incident history is a graph waiting to be built. Here's how to turn Rootly data into a knowledge graph that surfaces patterns, predicts blast radius, and gives your AI SRE actual memory.
Sylvain Kalache
It’s downloaded ~7,000 times a month, and it’s running in production at companies you’ve definitely heard of.
Rootly Academy is incident response training designed the way engineers actually learn: by working through realistic scenarios under pressure.
Anthropic released Claude Sonnet-4.6, and we ran it through SRE-skills-bench the same day. It tests models on the tasks SREs actually do: understanding infrastructure code, reasoning about cloud configurations, and mapping code diffs to real-world pull requests.
Introducing On-Call Health, an open-source way of detecting responder overload.
A shift just happened in SRE AI performance. Gemini 3 Pro didn’t just edge out OpenAI’s models; it beat them across every SRE task we threw at it. The landscape is changing faster than anyone expected.
The new edition of our benchmark features Terraform tasks across AWS, GCP, and Azure, and adds a new dimension: prompt optimization.
An open source, research-based tool that looks for early-warning signs of burnout in your on-call engineers.
5 AI and reliability talks you can’t miss, plus the perfect after-conference events to wrap up Days 1 and 2 in Dublin
Making LLM evaluations reproducible for real-world SRE workflows
Reliability engineering is evolving quickly—and AI is the catalyst. That’s why we’re excited to unveil Rootly AI Labs, a community-focused program dedicated to reshaping reliability through open collaboration, innovative prototypes, and cutting-edge research.
Rootly AI Labs analyzes the performance of Meta’s Llama 4 models and finds they underperform compared to competitors like Claude 3.5 Sonnet and Qwen2.5