Talk

How We Built a Data Lakehouse Platform on Apache Ozone

In Russian

This talk covers how my team at "Ostrovok!" built and launched Dota 2, the second version of our internal analytics platform, which replaces the previous solution based on Vertica and HDFS. We will discuss the architectural choices we made and the move to an S3-compatible infrastructure built on Apache Ozone together with Trino, Spark and Iceberg.

We will cover the following topics in more detail:

  • Limitations of the previous platform and the reasons for the transition.
  • Analysis of alternatives: why we chose Apache Ozone as the S3 API-compatible storage layer.
  • Specifics of integrating Ozone with Trino, Spark and Iceberg.
  • Dota 2 architecture: from ingestion to analytics.
  • Practical experience migrating pipelines and data.
  • Lessons, insights and pitfalls we encountered along the way.
