Talk type: Talk
How We Adapted Dynamic YTsaurus Tables to Store Blobs
To improve the efficiency of YTsaurus, we decided to take blobs out of "normal" tabular data and store them separately. We had to adapt the compaction algorithms so that they could collect "garbage" among the blob blocks while striking a suitable tradeoff between disk space usage (space amplification) and the amount of data that is repeatedly rewritten (write amplification). We also applied the same approach to a number of tables that had been kept in RAM. As a result, we moved some of their data to disk (under the guise of blobs!) and cut RAM consumption several-fold while keeping read latency low at high quantiles. Along the way, we had to significantly improve the I/O stack by switching to io_uring, and the block storage layer by adding a consistent hashing algorithm for choosing replica placement.
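As a rough illustration of the replica placement idea, here is a minimal sketch of a generic consistent hashing ring in Python. It is not the YTsaurus implementation: the class name `HashRing`, the `vnodes` parameter, the node names, and the choice of SHA-256 are all assumptions made for the example.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # Stable 64-bit hash; SHA-256 is an arbitrary choice for this sketch.
    return int.from_bytes(hashlib.sha256(key.encode()).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent hashing ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64):
        # Each node owns `vnodes` points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]
        self._node_count = len(set(nodes))

    def replicas(self, block_id: str, count: int = 3):
        # Walk clockwise from the block's hash and collect distinct nodes.
        count = min(count, self._node_count)
        if count <= 0:
            return []
        start = bisect.bisect_right(self._points, _hash(block_id))
        chosen = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in chosen:
                chosen.append(node)
                if len(chosen) == count:
                    break
        return chosen
```

With this scheme, `HashRing(["n1", "n2", "n3", "n4"]).replicas("blob-1")` returns three distinct nodes, and removing a node only changes the placement of blocks that had a replica on it.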
Company: PostgreSQL JDBC committer