SREcon23 Asia/Pacific - Mastering Chaos: Achieving Fault Tolerance with Observability-Driven...
Mastering Chaos: Achieving Fault Tolerance with Observability-Driven Prioritized Load Shedding
Harjot Gill and Hardik Shingala, FluxNinja, Inc.
Microservices-based applications are complex, with metastable failures like cascading failures and retry storms posing significant challenges. In this talk, we will explore these types of failures, the shortcomings of current state-of-the-art approaches, and introduce Aperture, a unique open-source tool for observability-driven prioritized load shedding.
Aperture enables graceful degradation of non-critical services, ensuring system stability. We'll delve into Aperture's innovative architecture, covering its control and data planes, and discuss how it employs token buckets, weighted fair queuing, and concurrency limiting to prioritize workloads effectively.
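To make the idea of prioritized load shedding concrete, here is a minimal sketch that combines a token bucket with a priority queue. It is not Aperture's implementation or API: the class names (`TokenBucket`, `PrioritizedShedder`), the parameters (`rate`, `capacity`, `max_queue`), and the numeric priorities are all hypothetical, and strict priority ordering stands in for the weighted fair queuing mentioned above. Time is passed in explicitly so the behavior is deterministic.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """Refills at `rate` tokens per second and holds at most `capacity` tokens."""
    rate: float
    capacity: float
    last: float = 0.0
    tokens: float = field(init=False)

    def __post_init__(self):
        # Start full so an initial burst up to `capacity` is allowed.
        self.tokens = self.capacity

    def try_take(self, now, cost=1.0):
        """Refill for the elapsed time, then spend `cost` tokens if available."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


class PrioritizedShedder:
    """Hypothetical scheduler: queue requests, admit high priority first,
    and shed the lowest-priority request when the queue overflows."""

    def __init__(self, bucket, max_queue):
        self.bucket = bucket
        self.max_queue = max_queue
        self._heap = []                 # entries: (-priority, arrival_seq, name)
        self._seq = itertools.count()

    def submit(self, name, priority):
        """Enqueue a request. Returns the name of a shed request, or None."""
        heapq.heappush(self._heap, (-priority, next(self._seq), name))
        if len(self._heap) > self.max_queue:
            # Lowest priority loses; among equals, the newest arrival is shed.
            victim = max(self._heap)
            self._heap.remove(victim)
            heapq.heapify(self._heap)
            return victim[2]
        return None

    def drain(self, now):
        """Admit queued requests in priority order while tokens remain."""
        admitted = []
        while self._heap and self.bucket.try_take(now):
            admitted.append(heapq.heappop(self._heap)[2])
        return admitted
```

With `TokenBucket(rate=1.0, capacity=2.0)` and `max_queue=3`, submitting `checkout` (priority 10), `recs` (1), `search` (5), and `analytics` (1) sheds `analytics`. Draining at time 0 then admits `checkout` and `search`, and `recs` waits until a token is refilled one second later. In this sketch the critical workload keeps flowing while the non-critical ones are delayed or dropped.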
We will also share real-world results from implementing Aperture in cloud products, demonstrating its ability to protect multi-tenant databases from overloads through prioritized load shedding of gRPC and GraphQL traffic.
Join us on this journey as we unveil a powerful solution that addresses the limitations of current approaches, ensuring the reliability and resilience of your microservices-based applications.