Alerting as Code: How Mistral AI Uses Terraform as the Source of Truth
A Terraform-first model for deterministic alerting in AI systems


An open source, research-based tool that looks for early-warning signs of burnout in your on-call engineers.



The Reliability Top 50 honors those who keep our ambitious systems running, translating SLOs into uptime, transforming postmortems into industry standards, and teaching us all how to fail more gracefully.



The panel warned: the opportunity is massive, but without observability, security, and strategy, the regrets will be real.



5 AI and reliability talks you can’t miss, plus the perfect after-conference events to wrap up Days 1 and 2 in Dublin



“Art, in itself, is an attempt to bring order out of chaos.” - Stephen Sondheim



Making LLM evaluations reproducible for real-world SRE workflows



Learn how to structure an incident response team with defined roles, responsibilities, and workflows to reduce downtime and improve resilience.



Discover the complete incident response process for SRE teams. From detection to postmortems, learn how to manage incidents with clarity and speed.



Discover how AI in incident response cuts MTTR through rapid detection, automated triage, and faster resolution, boosting uptime and reliability.
