Talk
Date: 12.10 / Start: 00:00 – Finish: 00:00

How to design a high-performance distributed SQL engine

In Russian
Complexity -

Distributed SQL engines must process data across multiple servers. In this talk, Vladimir uses Apache Flink and Presto as examples to explain how distributed SQL engines are structured and which approaches they use to improve query performance.

During this session we'll see:

architecture of distributed relational operators, such as aggregate, sort, and join;
partitioning data in a cluster to minimize data transfer between nodes;
use of cost-based optimizers to find optimal execution plans;
splitting complex plans into independent fragments, and organizing data transfer between them;
advanced techniques: compilation, vectorization, pruning.

#queryengine
#queryoptimization
#tooling

Speakers

Vladimir Ozerov
Querify Labs
Alexey Goncharuk
Querify Labs
