The Evolution of Apache Kafka: From In-House Infrastructure to Managed Cloud Service ft. Jay Kreps


Kafka started out at LinkedIn as a distributed stream processing framework and was core to their central data pipeline. At the time, the challenge was to address scalability for real-time data feeds. The social media platform’s initial data system was built on Apache™ Hadoop®, but the team later realized that operationalizing and scaling the system required a considerable amount of work.

When they started re-engineering the infrastructure, Jay observed a big gap in data streaming: on one end, data was being looked at constantly for analytics, while on the other end, data was being looked at once a day, with no real-time interconnection between the two. This ushered in efforts to build a distributed system that connects applications, data systems, and organizations for real-time data. That goal led to the birth of Kafka and eventually a company around it, Confluent.

Over time, Confluent progressed from focusing solely on Kafka as a software product to a more holistic view: Kafka as a complete central nervous system for data, integrating connectors and stream processing with a fully managed cloud service.

Now, as organizations make a similar shift from in-house infrastructure to fully managed services, Jay outlines five guiding points to keep in mind:
1. Cloud-native systems abstract away operational effort, so you don't have to worry about infrastructure
2. It’s important to have a complete ecosystem for Kafka, including connectors, a SQL layer, and data governance
3. A distributed system should allow data to be accessible everywhere and across organizations
4. Identifying a dependable storage infrastructure layer, such as Amazon S3, is critical
5. Cost-effective models mean sustainability and systems that are easy to build around

#cloudnative #apachekafka #kafka #confluent
Comments

"Jenkins on Kafka", can't make that up ;-)

Aleamanic