Lessons in building resilient systems at Amazon and Meta | Zuodong Xiang | Conf42 IM 2024

Chapters
0:00 Introduction: What Can Possibly Go Wrong?
0:58 Real-Life Scenario: Flood of Traffic
3:01 Real-Life Scenario: Retry Storm
5:10 Real-Life Scenario: Plan B Went Poorly
6:24 Real-Life Scenario: Bad Commit
8:21 Real-Life Scenario: Lack of Sufficient Ownership
9:21 Real-Life Scenario: Script Errors
10:20 Prevention Strategies: Defensive Coding Practices
11:09 Logging and Error Handling Best Practices
12:35 Setting Effective Alerts
15:04 Mitigation Strategies for Alerts
15:46 Preparing for High Velocity Events
17:27 Conducting a Self Review
19:42 Conclusion and Takeaways