Apache Druid Deep Dive

Talk abstract: Apache Druid is an open-source analytics database powering fresh, fast analytics at companies from Airbnb to Zeotap on clickstream, telemetry, financial transactions, applications, and more. In this talk, we open the box on the three distributed processes in Druid, led by the coordinator, overlord, and broker, and the ways these come together to deliver reliable, performant query, ingestion, and management services.

Bio: Jon King is a Sr. Field Engineer at Imply. Jon has been in big data for 13+ years and is fluent in Hadoop, Spark, Hive, Presto, and Druid. Previously, he’s built and managed data teams at SolidFire, NetApp, and Ibotta. He’s a 2x O’Reilly author (Operationalizing Data Lakes in the Cloud (2019)) and contributor (Programming Hive (2012)). Outside of work, he enjoys traveling and spending time with his family in the Colorado mountains.

RVA Data Engineering

Comments

Wondering if there is a detailed use-case study on fintech, particularly at a wealth management firm.

luckyfarru

37:15 If we group by time, how can we count distinct users per dimension?
I.e., how many unique visitors did I have in the last 2 weeks, filtered by dimension "treatment-1"?

56:00 How will it handle revenue (sales dollars), where the data does not seem to fit dictionary encoding and bitmaps? How fast is it to sum per dimension?

programminginterviewsprepa
