Kirill Romanikhin
Place.01
This talk shares practical experience building a Data Streaming Lakehouse, where data is updated continuously and becomes available for analytics in near real time. We will walk through the entire pipeline from source to data mart: how to set up continuous change data capture (CDC) and how to load data into distributed storage while keeping reads fast for end users.
Technologies: MySQL (source), Apache Flink (stream processing), Apache Paimon (table format), HDFS (storage layer), StarRocks (MPP/OLAP engine for consumption).
Target Audience: Data Engineers, Data Architects, and DWH Developers.
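To make the pipeline above concrete, here is a minimal sketch of how the listed technologies could be wired together. It is not the speaker's actual setup: the `orders` table, its columns, the hostnames, credentials, catalog names, and warehouse path are all hypothetical placeholders. The Python code only assembles the SQL text; the Flink statements would be submitted through the Flink SQL client with the MySQL CDC and Paimon connector JARs installed, and the last statement would be run in StarRocks.

```python
"""Sketch: SQL for a MySQL -> Flink CDC -> Paimon (HDFS) -> StarRocks pipeline.

Every name, host, path, and column below is a hypothetical placeholder.
"""

# Shared schema so that the CDC source and the Paimon table stay aligned.
COLUMNS = """\
    order_id BIGINT,
    customer_id BIGINT,
    amount DECIMAL(10, 2),
    updated_at TIMESTAMP(3),
    PRIMARY KEY (order_id) NOT ENFORCED"""


def build_pipeline_statements(warehouse="hdfs:///lakehouse/warehouse"):
    """Return the Flink SQL statements, in the order they would be run."""
    return [
        # Paimon commits streaming writes on checkpoints, so this interval
        # bounds how fresh the data visible to readers can be.
        "SET 'execution.checkpointing.interval' = '1 min'",
        # Paimon catalog backed by a filesystem warehouse on HDFS.
        f"CREATE CATALOG paimon WITH (\n"
        f"    'type' = 'paimon',\n"
        f"    'warehouse' = '{warehouse}'\n"
        f")",
        "CREATE DATABASE IF NOT EXISTS paimon.ods",
        # Primary-key table: updates and deletes from the CDC stream are merged by key.
        f"CREATE TABLE IF NOT EXISTS paimon.ods.orders (\n{COLUMNS}\n)",
        # MySQL CDC source: reads an initial snapshot, then follows the binlog.
        f"CREATE TEMPORARY TABLE orders_src (\n{COLUMNS}\n) WITH (\n"
        f"    'connector' = 'mysql-cdc',\n"
        f"    'hostname' = 'mysql.example.internal',\n"
        f"    'port' = '3306',\n"
        f"    'username' = 'flink_cdc',\n"
        f"    'password' = '<password>',\n"
        f"    'database-name' = 'shop',\n"
        f"    'table-name' = 'orders'\n"
        f")",
        # Continuous streaming job that applies the changelog to the Paimon table.
        "INSERT INTO paimon.ods.orders SELECT * FROM orders_src",
    ]


def starrocks_catalog_ddl(warehouse="hdfs://namenode:8020/lakehouse/warehouse"):
    """Return the StarRocks statement that exposes the Paimon warehouse for queries."""
    return (
        "CREATE EXTERNAL CATALOG paimon_lake\n"
        "PROPERTIES (\n"
        '    "type" = "paimon",\n'
        '    "paimon.catalog.type" = "filesystem",\n'
        f'    "paimon.catalog.warehouse" = "{warehouse}"\n'
        ")"
    )


if __name__ == "__main__":
    print("-- Flink SQL")
    for statement in build_pipeline_statements():
        print(statement + ";\n")
    print("-- StarRocks SQL")
    print(starrocks_catalog_ddl() + ";")
```

With the external catalog in place, StarRocks could query the table as `paimon_lake.ods.orders` without copying the data, which is one way to serve data marts directly from the lakehouse storage.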