Talk type: Talk

Fast data processing in a data lake with Trino

  • Talk in Russian
Presentation pdf

The speaker will cover the implementation and practical use of the key optimizations that let Trino and related commercial products quickly crunch through the data in your lake: using Parquet and ORC metadata to reduce the amount of data read (projection, filter, and aggregate pushdown); dynamic (runtime) filtering; late materialization of columns; and no fewer than three local caches: a metadata cache, a data cache, and an intermediate query result cache.
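To make two of these ideas concrete, here is a toy sketch of metadata-based pruning and dynamic filtering. It is not Trino code: `RowGroup`, `scan`, and `dynamic_filter_range` are hypothetical names invented for illustration, and real Parquet or ORC readers keep min/max statistics in file footers rather than computing them from the values.

```python
from dataclasses import dataclass, field


@dataclass
class RowGroup:
    """Hypothetical stand-in for a Parquet row group or ORC stripe."""
    values: list
    min_value: int = field(init=False)
    max_value: int = field(init=False)

    def __post_init__(self):
        # Assumption: in a real file these statistics live in the footer metadata.
        self.min_value = min(self.values)
        self.max_value = max(self.values)


def scan(row_groups, lo, hi):
    """Return rows with lo <= value <= hi and the number of row groups read."""
    matched, groups_read = [], 0
    for rg in row_groups:
        # Filter pushdown: skip the whole group using metadata only.
        if rg.max_value < lo or rg.min_value > hi:
            continue
        groups_read += 1
        matched.extend(v for v in rg.values if lo <= v <= hi)
    return matched, groups_read


def dynamic_filter_range(build_keys):
    """Derive a runtime range filter from the join keys of the build side."""
    return min(build_keys), max(build_keys)


fact = [RowGroup([1, 2, 3]), RowGroup([10, 11, 12]), RowGroup([20, 25, 30])]

# Static predicate known at planning time: value BETWEEN 10 AND 12.
static_rows, static_reads = scan(fact, 10, 12)

# Dynamic filter: the range is only known after the build side has been read.
lo, hi = dynamic_filter_range([21, 25])
dynamic_rows, dynamic_reads = scan(fact, lo, hi)
```

In both scans only one of the three row groups is read. The dynamic range is a superset filter, so in a real engine the join itself would still decide which of the surviving rows match.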

  • #trino
  • #cedrusdata
  • #optimization


Invited experts