Talks

  • The program has not been finalized yet, so some changes are still possible.

  • Talk

    StarRocks: the Reality of the Modern Data Platform

    Our company's data platform has existed for more than five years, and during that time it has absorbed a lot of trendy (and not so trendy) solutions. I will tell you how we tried to choose our future among ClickHouse, Greenplum and Trino, and found StarRocks.

  • Talk

    What Metastore Is

    What a metastore is, how it works in the big data ecosystem, which solutions exist on the market, and why we decided to develop our own. I will share our practical experience, the architecture, and the lessons we have learned.

  • Talk

    Third-Party Runtime Engines for Apache Spark: Practical Experience

    Our experience of using the Comet and Gluten (Velox) execution engines, from an introduction and the specifics of the build to the results of testing on real ETL jobs. I will tell you about pitfalls and non-obvious points, show the results, and consider the cases when these engines are useful and when they don't work at all.

  • Talk

    Vector Search Algorithms in YDB

    YDB has come a long way, from applying basic vector search techniques to building a scalable and efficient vector index. The talk gives a detailed account of the stages in the evolution of vector search in YDB, including the challenges encountered and the engineering solutions.

  • Talk

    DataRentgen: What’s Wrong With OSS Data Catalog and How To Make It Better

    The story of developing an open source data lineage solution based on OpenLineage + Kafka + FastStream + FastAPI. A comparison with other open source solutions (OpenMetadata, DataHub, Marquez, OpenAtlas) and why we abandoned them in favor of our own development. No, this is not another custom Data Catalog :)

We are actively adding to the program, and more talks will be announced soon. Sign up for our newsletter to stay informed.
