Oleksandr Pryymak - Probabilistic Data Structures & Approximate Solutions


IPython Notebook:

View slides here:

Will your decisions change if you learn that the audience of your website isn't 5M users, but rather 5'042'394'953? Unlikely, so why should we always calculate the exact solution at any cost? An approximate solution to this and many similar problems takes only a fraction of the memory and runtime needed to calculate the exact solution.
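As a rough illustration of that trade-off, here is a minimal sketch of one well-known approach to approximate unique counting, a k-minimum-values (KMV) estimator. It is not taken from the talk's notebook; the names `hash01` and `estimate_distinct`, the choice of SHA-1, and the default `k=256` are assumptions made for this example. Memory stays bounded by `k` hash values no matter how many users pass through the stream.

```python
import hashlib
import heapq


def hash01(item: str) -> float:
    """Map an item to a pseudo-uniform float in [0, 1] (hypothetical helper)."""
    digest = hashlib.sha1(item.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def estimate_distinct(items, k: int = 256) -> float:
    """Estimate the number of distinct items while keeping only k hash values."""
    heap = []        # negated hashes, so heap[0] is the largest value kept
    members = set()  # hashes currently kept, used to skip repeats
    for item in items:
        h = hash01(item)
        if h in members:
            continue
        if len(heap) < k:
            heapq.heappush(heap, -h)
            members.add(h)
        elif h < -heap[0]:
            # New hash is smaller than the largest kept one: swap them.
            evicted = -heapq.heappushpop(heap, -h)
            members.discard(evicted)
            members.add(h)
    if len(heap) < k:
        # Fewer than k distinct hashes were seen, so the count is exact.
        return float(len(heap))
    # The k-th smallest of n uniform values sits near k / n.
    return (k - 1) / -heap[0]


if __name__ == "__main__":
    stream = (f"user-{i % 50_000}" for i in range(100_000))
    print(round(estimate_distinct(stream)))
```

With `k=256` the relative error is typically a few percent, which is usually enough for the kind of decision described above; a larger `k` trades memory for accuracy.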

This tutorial is a practical survey of useful probabilistic data structures and algorithmic tricks for obtaining approximate solutions. It covers when we should use them, and when we should not trade accuracy for scalability. In particular, we start with hashing and sampling; address the problems of comparing and filtering sets, and of counting the number of unique values and their occurrences; and touch on basic hashing tricks used in machine learning algorithms. Finally, we analyse some examples of their usage that show their full power: how to organise online analytics, or how to decode a DNA sequence by squeezing a large graph into a Bloom filter.
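For readers unfamiliar with the set-filtering structure named above, the following is a minimal Bloom filter sketch. It is an illustration under stated assumptions, not the implementation used in the tutorial: the class name, the default sizes (`num_bits=8192`, `num_hashes=5`) and the use of SHA-256 with double hashing to derive bit positions are all choices made for this example.

```python
import hashlib


class BloomFilter:
    """Set-membership filter: may give false positives, never false negatives."""

    def __init__(self, num_bits: int = 8192, num_hashes: int = 5):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions from two 64-bit halves of one digest
        # (double hashing); this is an assumption of the sketch.
        digest = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        for i in range(self.num_hashes):
            yield (h1 + i * h2) % self.num_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


if __name__ == "__main__":
    seen = BloomFilter()
    for i in range(500):
        seen.add(f"url-{i}")
    print("url-42" in seen)  # an added item is always reported as present
    false_hits = sum(f"other-{i}" in seen for i in range(1000))
    print(false_hits)        # a small number of false positives is possible
```

The filter stores 500 items in 1 KiB of bits regardless of how long the items are. The price is that a lookup for an item that was never added can occasionally return `True`, which is the accuracy-for-scalability trade the abstract refers to.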
