The site reliability workbook : practical ways to implement SRE

Betsy Beyer (Editor), Niall Richard Murphy (Editor), David K. Rensin (Editor), Kent Kawahara (Editor), Stephen Thorne (Editor)
Google’s Site Reliability Engineering book ignited an industry discussion on what it means to run production services today. Now, Google engineers who worked on that bestseller introduce The Site Reliability Workbook, a hands-on companion that uses concrete examples to show you how to put SRE principles and practices to work in your environment.
Print Book, English, 2018
O'Reilly Media, Sebastopol, CA, 2018
xxx, 474 pages : illustrations ; 24 cm
9781492029502, 1492029505
1029786800
How SRE relates to DevOps
Part 1. Foundations
Implementing SLOs
SLO engineering case studies
Monitoring
Alerting on SLOs
Eliminating toil
Simplicity
Part 2. Practices
On-call
Incident response
Postmortem culture : learning from failure
Managing load
Introducing non-abstract large system design
Data processing pipelines
Configuration design and best practices
Configuration specifics
Canarying releases
Part 3. Processes
Identifying and recovering from overload
SRE engagement model
SRE : reaching beyond your walls
SRE team lifecycles
Organizational change management in SRE
Example SLO document
Example error budget policy
Results of postmortem analysis
"Companion to the bestselling SRE book"--Cover