Evolution of fault tolerance

Author: Ken Birman

Abstract:

Ken Birman's talk focused on controversies surrounding fault tolerance and consistency. Looking back at the 1990s, he pointed to the debate around the so-called CATOCS question (CATOCS refers to causally and totally ordered communication primitives) and drew a parallel to the more modern debate about consistency at cloud scale (often referred to as the CAP conjecture). Ken argued that the underlying tension pits basic principles of the field against the seemingly unavoidable complexity of mechanisms strong enough to solve consensus, particularly the family of protocols with Paxos-like structures. He concluded that this tension has been resolved over time: today we finally know how to build very fast and scalable solutions (those who attended SOSP 2015 itself saw ten or more papers on such topics). On the other hand, Ken sees a new generation of challenges on the horizon: cloud-scale applications that will need a novel mix of scalable consistency and real-time guarantees, will need to leverage new hardware options (RDMA, NVRAM and other "middle memory" options), and may need to be restructured to reflect a control-plane/data-plane split. These trends invite a new look at what has become a core topic for the SOSP community.
