How to Survive Ad Hoc and Interactive Analytics With Big Data❓

Is there anything more frustrating than waiting and waiting and waiting for a query to complete? And yet, most solutions take sooooooo long!

It doesn’t need to be that way. Learn how to deliver ad hoc speed at big data scale. This deep dive covers techniques and common best practices for delivering ad hoc speed across:

▪️ Schema design and ELT
▪️ Data storage and access optimization
▪️ Query optimization
▪️ Indexing

We’ll compare a few technologies including Athena and Firebolt, and do some demos along the way to show the difference.

Timestamps:

0:00 Intro
3:18 Challenges with ad hoc interactive analytics
10:39 What is needed for ad hoc and interactive analytics
20:47 Decoupled storage and compute
21:46 Self-service ELT from the data lake
24:12 Self-service optimization
26:08 Complete isolation of changes
27:01 Sub-second queries at scale
29:07 High-performance joins
30:31 Fast group-by and filter operations
30:22 Support for semi-structured analytics
30:41 High user/query concurrency
38:32 Q&A

More on ad hoc analytics: