Talk type: Talk

How We Adapted Dynamic YTsaurus Tables to Store Blobs

  • Talk in Russian
Presentation pdf

To improve the efficiency of YTsaurus, the team decided to separate blobs from "normal" tabular data and store them apart. This required modifying the compaction algorithms so that they could collect "garbage" among the blocks while striking a suitable trade-off between disk space usage (space amplification) and the amount of data that is repeatedly rewritten (write amplification). The team applied the same approach to a number of tables that had been kept in RAM: some of their data was moved to disk (under the guise of blobs!), which reduced RAM consumption severalfold while keeping read latency low at high quantiles. Along the way, the IO stack had to be significantly improved by switching to io_uring, and the block storage layer by adding a consistent hashing algorithm for choosing replica placement.
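The trade-off between space amplification and write amplification can be illustrated with a small sketch. It is not the YTsaurus compaction algorithm; it only assumes a simple policy in which a blob chunk is rewritten once the share of garbage in it exceeds a threshold. The names BlobChunk, plan_compaction, amplification and max_garbage_ratio are hypothetical and introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass
class BlobChunk:
    """Hypothetical model of an on-disk chunk that holds blob values."""
    total_bytes: int  # bytes the chunk occupies on disk
    live_bytes: int   # bytes still referenced by table rows

    @property
    def garbage_ratio(self) -> float:
        # Share of the chunk that is no longer referenced.
        if self.total_bytes == 0:
            return 0.0
        return 1.0 - self.live_bytes / self.total_bytes


def plan_compaction(chunks, max_garbage_ratio):
    """Split chunks into those to rewrite and those to keep untouched.

    Assumed policy: rewrite a chunk only when its garbage ratio
    exceeds max_garbage_ratio.
    """
    rewrite = [c for c in chunks if c.garbage_ratio > max_garbage_ratio]
    keep = [c for c in chunks if c.garbage_ratio <= max_garbage_ratio]
    return rewrite, keep


def amplification(chunks, max_garbage_ratio):
    """Return (space amplification after compaction, bytes rewritten)."""
    rewrite, keep = plan_compaction(chunks, max_garbage_ratio)
    live_total = sum(c.live_bytes for c in chunks)
    # A rewritten chunk keeps only its live bytes; a kept chunk
    # continues to occupy its full size, garbage included.
    bytes_rewritten = sum(c.live_bytes for c in rewrite)
    disk_after = bytes_rewritten + sum(c.total_bytes for c in keep)
    space_amp = disk_after / live_total if live_total else 0.0
    return space_amp, bytes_rewritten


chunks = [BlobChunk(100, 90), BlobChunk(100, 50), BlobChunk(100, 10)]
lazy_space, lazy_written = amplification(chunks, max_garbage_ratio=0.8)
eager_space, eager_written = amplification(chunks, max_garbage_ratio=0.3)
print(lazy_space, lazy_written)
print(eager_space, eager_written)
```

With a high threshold only the mostly dead chunk is rewritten, so little data is copied but more garbage stays on disk. Lowering the threshold reclaims more space at the cost of copying more live data, which is the tension the compaction changes have to balance.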

  • #ytsaurus
  • #blob
  • #storage


Invited experts