
Trino (Presto) DB: Zero copy lakehouse

Day 1

RU

PrestoDB, from which Trino was later forked, was created at Facebook to provide interactive access to data stored in Hive/Hadoop. Since then, Hadoop has been declared dead, Spark and S3 are replacing Hive and HDFS, and the industry is talking about microservices, IoT, and multi/hybrid clouds. Yet analytic queries are still slow, ETL still runs overnight, and most of the data turns out to be unnecessary. And the data schema in the source has changed again! Can we avoid copying the data at all? Yes: use Trino or one of its competitors (Dremio, Drill). Query the data directly in the primary sources, combining and enriching it, with subsecond response times. Of course, there are problems, but there are solutions too. We'll talk about those, as well as hidden opportunities, new functionality, and what is available in the project and its forks.

This talk is for data engineers and cloud system architects and offers an interesting approach to integrating data sources.

  • #queryoptimization
  • #datavirtualisation
  • #queryengine
  • #tooling
