Containing an Elephant: How we moved Hadoop/HBase into Kubernetes and Public Cloud - Dhiraj Hegde

STRATEGIC
---------------
Google

PLATINUM
--------------
Apple
Huawei
Instaclustr
Tencent Cloud

GOLD
-----------
Aiven OY
AWS
Baidu
Cerner
Didi Chuxing
Dremio
Fiter
Gradle
Red Hat

We run a very large number of HBase and HDFS clusters in our data centers, holding multiple petabytes of data and serving billions of queries per day across thousands of machines. After more than a decade of operating our own data centers, we pivoted towards Public Cloud for its scalability and high availability features. As part of this effort, we made a bold decision to move our HBase and HDFS clusters from staid bare metal hosts to the dynamic and immutable world of containers and Kubernetes.

In this talk I will go over why we chose Kubernetes, the challenges we ran into with this choice, and how we overcame them. These challenges include:
1. Limitations of Kubernetes when managing large-scale stateful applications
2. Failures experienced in HBase/HDFS in such environments
3. Adapting HBase/HDFS availability and durability to Kubernetes
4. Complexity of DNS in Public Cloud and its impact on HDFS/HBase clusters