Site Reliability Engineering at Google • Christof Leng • GOTO 2017

This presentation was recorded at GOTO Berlin 2017

Christof Leng - Senior Site Reliability Engineer at Google @ChristofLeng

ABSTRACT
Site reliability engineers are Google's experts for operating its tech infrastructure and products. They need to keep up with the enormous scale, rapid growth, and daunting complexity of Google's systems landscape. As traditional methods would not work, SRE treats operations as if [...]

TIMECODES
0:00 Introduction
3:26 Reliability is easy to take for granted
6:24 What is Site Reliability Engineering (SRE)?
9:15 Part I: Dev and Ops
13:49 Is conflict inevitable?
14:48 Service Level Agreement (SLA)
20:19 What do you spend your budget on?
21:09 The rule
22:18 Two nice features of Error Budgets
24:08 Part II: Staffing, Work, Ops Overload
28:55 SRE hires only coders
31:05 50% cap on Ops work
32:09 Keep DEV in the rotation
34:09 Speaking of Dev and Ops work...
35:21 SRE Portability
37:24 Part III: Death, taxes, and outages...
39:07 Minimize Damage
40:59 A word on practice...
41:16 Wheel of Misfortune
43:22 Prevent recurrence
44:21 Post-mortem philosophy
46:13 Summary
47:00 O'Reilly Book

Related videos

Comments

I am a python developer and I am going to start working as an SRE... wish me luck... it's going to be an exciting new path :)

maximilianoromayfigueroa

The arrogance of these people is just unbounded.

TheCALMInstitute

Meet Site Reliability Engineers at Google

Site Reliability Engineering at Google • Christof Leng • GOTO 2018

Google SRE virtual on-site interview: Part 1 | How to Prep | Why SRE? | Study Guide

What is Site Reliability Engineering (SRE)?

Site Reliability Engineering (SRE) Fundamentals

What's the Difference Between DevOps and SRE? (class SRE implements DevOps)

Site Reliability Engineering: How Google Runs… by Betsy Beyer · Audiobook preview

Apache JMeter Tutorials: Learn Apache JMeter in Just 2 Hours Part-22 - 2024

SLIs, SLOs, SLAs, oh my! (class SRE implements DevOps)

My SRE Interview Experience with Apple London & Google Warsaw

How to Adopt Site Reliability Engineering (SRE) on Google Cloud

Solving Reliability Fears with Site Reliability Engineering (Cloud Next '18)

DevOps vs SRE vs Platform Engineering | Clear Big Misconceptions

What is SRE | Tasks and Responsibilities of an SRE | SRE vs DevOps

DevOps Vs. SRE: Competing Standards or Friends? (Cloud Next '19)

Getting Started with Site Reliability Engineering - Google

Site Reliability Engineering (SRE) Best Practices with Google

DevOps Vs. SRE: Competing Standards or Friends? (Next '19 Rewind)

Site Reliability Engineering: Aligning developers and operators for better DevOps

Site reliability engineering, IAM Recommender, & more!

An Introduction to Site Reliability Engineering at Google - Jennifer Petoff

Eimear Crotty, Site Reliability Engineer at Google

Google Cloud Summit Paris - Site Reliability Engineering