Shrinking Production Incidents - Dan Adams (Google SRE)

"Shrinking Production Incidents" looks at how to learn from production outages and how to use a structured approach to reduce their overall impact.

Dan has been a Site Reliability Engineer at Google for two years, working on Google's globally distributed, high-QPS key-value database. Before that, he worked in several industries, including wireless communication and positioning hardware, mining robotics, and games.