'Druid: Powering Interactive Data Applications at Scale' by Fangjin Yang

Cluster computing frameworks such as Hadoop and Spark are tremendously beneficial for processing data and deriving insights from it. However, their long query latencies make them sub-optimal choices for powering interactive applications. Organizations frequently rely on dedicated query layers, such as relational databases and key/value stores, for lower query latencies, but these technologies suffer from many drawbacks for analytic use cases. In this session, we discuss using Druid for analytics and why its architecture is well suited to powering analytic applications.

User-facing applications are replacing traditional reporting interfaces as the preferred means for organizations to derive value from their datasets. To provide an interactive user experience, user interactions with analytic applications must complete on the order of milliseconds. To meet this need, organizations often struggle to select a proper serving layer. Many serving layers are chosen for their general popularity, without an understanding of their possible architectural limitations.

Druid is an analytics data store designed for analytic (OLAP) queries on event data. It draws inspiration from Google's Dremel, Google's PowerDrill, and search infrastructure. Many large technology companies are switching to Druid for analytics, and we will cover why the technology is a good fit for its intended use cases.

Related recommendations

Comments

Excellent presentation on Druid. Really helpful.

anilkumarupputuri

Unfortunately, RabbitMQ streaming is not supported anymore, which forces companies to move to Kafka.

michaeldeng

I agree that the star schema with relational databases is becoming outdated and inefficient, but maybe he took it too far in saying that companies don't use it anymore.

sabikerickssohn

But how do you get to the application? I mean deployment, especially locally on macOS.

bakyayita

@Override
public <T> QueryRunner<T> getQueryRunner(Query<T> query)
{
  throw new UnsupportedOperationException("Don't query me, bro.");
}

richardstartin

Building Data Applications with Apache Druid

Interactive Exploratory Analytics with Druid | DataEngConf SF '17

What happens once data is in Apache Druid?

Interactive real-time dashboards on data streams using Kafka, Druid, and Superset

Interactive real time dashboards on data streams using Kafka, Druid, and Superset

How Apache Druid addresses the 5 requirements of data applications

Druid Summit 2022: Powering Clickstream Analytics at TrueCar with Apache Druid

Druid and Hive together: interactive realtime analytics at scale

Druid and Hive Together : Use Cases and Best Practices

Lyft's Apache Druid uses cases (interactive web apps, data exploration, time series datastore)

Phenomenal scaling: itty bitty living space: scaling Apache Druid as a multi-tenant platform

Druid Interactive Queries Meet Real-Time Data Eric Tschetter and Danny Yuan

Pinterest: Powering Ad Analytics with Apache Druid

Maximizing Apache Druid performance: Beyond the basics

Apache Druid Use Cases

Advanced Real-Time And Batch Analytics Using Apache Druid

Druid 0.16.0 Quickstart

How Superset and Druid Power Real-Time Analytics at Airbnb | DataEngConf SF '17

Where Druid Fits in Your Streaming Architecture

Demonstrating Apache Druid Rollup

What’s inside an Apache Druid cluster?

When Should I use Apache Druid?

Enable Interactive Big Data Analytics with Power BI on Massive Datasets Using Kyligence