'I See What You Mean' by Peter Alvaro

I love query languages for many reasons, but mostly because of their semantics. Wait, come back! In contrast to most systems programming languages (whose semantics can be quite esoteric), the semantics of a query (given some inputs) are precisely its outcome -- rows in tables. Hence when we write a query, we directly engage with its semantics: we simply say what we mean. This makes it easy and natural to reason about whether our queries are correct: that is, whether they mean what we intended them to mean.

Query languages have traditionally been applied to a relatively narrow set of domains: historically, data at rest in data stores; more recently, data in motion through continuous, "streaming" query frameworks. Why stop here? Could query languages do for a notoriously complex domain such as distributed systems programming what they have done so successfully for data management? How would they need to evolve to become expressive enough to capture the programs that we need to write, while retaining a simple enough semantics to allow mere mortals to reason about their correctness?

I will attempt to answer these questions (and raise many others) by describing a query language for distributed programming called Dedalus. Like traditional query languages, Dedalus abstracts away many of the details we typically associate with programming, making data and time first-class citizens and relegating computation to a subordinate role, characterizing how data is allowed to change as it moves through space and time. As we will see, this shift allows programmers to directly reason about distributed correctness properties such as consistency and fault-tolerance, and lays the foundations for powerful program analysis and repair tools (such as Blazes and LDFI), as well as successive generations of data-centric programming languages (including Bloom, Edelweiss and Eve).

Peter Alvaro

UNIVERSITY OF CALIFORNIA SANTA CRUZ

@palvaro
Peter Alvaro is an Assistant Professor of Computer Science at the University of California Santa Cruz. His research focuses on using data-centric languages and analysis techniques to build and reason about data-intensive distributed systems, in order to make them scalable, predictable and robust to the failures and nondeterminism endemic to large-scale distribution. Peter is the creator of the Dedalus language and co-creator of the Bloom language.

Comments

This is one of those talks that makes me google for a few hours to barely get to the meat of the ideas presented! Really awesome stuff!

coolsebz

Wow, that was a mind-bender! Great talk. Looking at abstractions like time and the messy stuff that distributed systems give us will someday, hopefully, at UCSC first, make reasoning easy and natural. I suggest reading Stephen Toulmin on the philosophy side of the topic. He shows where/how the original problem of hiding abstractions took us down the wrong road. Glad to see Peter Alvaro working on re-integrating the world with new languages and respectful design. Bravo!

bonnydonny

It's almost stand-up comedy fused with hardcore tech. Great talk!

thomas.moerman

This talk gave me a nosebleed, two thumbs up

_gunna

Computation is rendezvous of ephemera... nice.

jjurksztowicz

Extremely profound guy. Almost like Michael Parenti in politics.

valtih

One thing that confused me is that he talked about Datalog using examples that only read data, which is actually quite easy even in ordinary languages. The difficult part seems to be that the order of the sequence needs to be guaranteed, mostly because of side effects of some sort. Am I missing something?

clementdato

I wonder how long before this, or at least the ideas here, get seen in production distributed systems.

arhyth

A distributed secure system would be similar to a blockchain system. It must not support data deletion or data updates. It must be a purely constructive system. A deletion must be just a new annotation about the state of some data. But this type of system may grow large, so we can keep the chain of changes but work in memory with a limited version containing only the current data for better performance. But the construction of new data based on old data must also be perfectly deterministic...

supersearch

Wouldn't it be fair to say that blockchains provide "the god line" and this is how they solve this fundamental distributed system problem?

chromosundrift