Building a Streaming Microservice Architecture: with Apache Spark Structured Streaming and Friends

As we continue to push the boundaries of what is possible with respect to pipeline throughput and data serving tiers, new methodologies and techniques continue to emerge to handle larger and larger workloads – from real-time processing and aggregation of user / behavioral data, and rule-based / conditional distribution of event and metric streams, to almost any data pipeline / lineage problem. These workloads are typical in most modern data platforms and are critical to all operational analytics systems, data storage systems, ML / DL, and beyond. One of the most common problems I’ve seen across companies reduces to general data reliability, driven by the need to scale and migrate processing components as the company expands and teams grow. What was once a handful of systems can quickly fan out into a slew of independent components and serving layers, all of which need to be scaled up, down, or out with zero downtime to meet the demands of a world hungry for data. During this technical deep dive, a new mental model will be built up that aims to reinvent how one should build massive, interconnected services using Kafka, Google Protocol Buffers / gRPC, and Parquet / Delta Lake / Spark Structured Streaming. The material presented is based on lessons learned the hard way while building a massive real-time insights platform at Twilio, where data integrity and stream fault-tolerance are as critical as the services the company provides.
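To make the pipeline shape concrete, below is a minimal PySpark sketch (not code from the talk itself) of the kind of service described above: protobuf-encoded events are read from Kafka, decoded with Spark's protobuf support (Spark 3.4+), and appended to a Delta Lake table with checkpointing for fault tolerance. The broker address, topic, message name, descriptor file, package versions, and paths are hypothetical placeholders.

from pyspark.sql import SparkSession
from pyspark.sql.protobuf.functions import from_protobuf  # requires Spark 3.4+ and the spark-protobuf package

spark = (SparkSession.builder
         .appName("streaming-microservice-sketch")
         # spark-protobuf and delta-spark must be on the classpath; versions here are placeholders
         .config("spark.jars.packages",
                 "org.apache.spark:spark-protobuf_2.12:3.5.0,io.delta:delta-spark_2.12:3.1.0")
         .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
         .config("spark.sql.catalog.spark_catalog",
                 "org.apache.spark.sql.delta.catalog.DeltaCatalog")
         .getOrCreate())

# 1. Read the raw event stream from Kafka; the `value` column holds protobuf-encoded bytes.
raw = (spark.readStream
       .format("kafka")
       .option("kafka.bootstrap.servers", "broker:9092")   # hypothetical broker
       .option("subscribe", "user.behavior.events")        # hypothetical topic
       .load())

# 2. Decode the protobuf payload into typed columns using a compiled descriptor set
#    (e.g. produced by `protoc --descriptor_set_out=events.desc`); message name is hypothetical.
events = raw.select(
    from_protobuf("value", "UserEvent", descFilePath="/etc/schemas/events.desc").alias("event")
).select("event.*")

# 3. Sink the typed stream to a Delta Lake table, with checkpointing for exactly-once appends.
query = (events.writeStream
         .format("delta")
         .option("checkpointLocation", "/chk/user_events")  # hypothetical checkpoint path
         .outputMode("append")
         .start("/lake/user_events"))                       # hypothetical table path

query.awaitTermination()

The same shape generalizes to the fan-out problem described above: each downstream consumer is just another Structured Streaming job reading the same protobuf contract, so services can be scaled or migrated independently without breaking the data model.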

About:
Databricks provides a unified data analytics platform, powered by Apache Spark™, that accelerates innovation by unifying data science, engineering and business.

Comments:

Thanks for doing this. Excellent overview of all the pieces involved. We are using a similar architecture with protobuf/Kafka/pyspark to standardize our data engineering pipelines.

PokeRowlet