Talk

Data Streaming Lakehouse: How to Stream Data Into Paimon and Not Drown

Language: Russian

In this talk, we share practical experience building a Data Streaming Lakehouse, where data is updated and becomes available for analytics in near real time. We will walk through the entire pipeline from source to data mart: how to set up continuous change data capture (CDC) and load the data into distributed storage while keeping reads fast for end users.

Technologies: MySQL (source), Apache Flink (stream processing), Apache Paimon (table format), HDFS (storage layer), StarRocks (MPP/OLAP engine for consumption).

Target Audience: Data Engineers, Data Architects, and DWH Developers.
