Talk

Sharded and Distributed Are Not the Same: What You Must Know When PostgreSQL Is Not Enough

  • In Russian
Presentation pdf

PostgreSQL is widely regarded as extremely efficient, and it scales well vertically. It is also no secret that PostgreSQL scales only vertically, so its performance is limited by the capabilities of a single server. Most Citus-like solutions let you shard the database, but a sharded database is not a distributed one: it does not provide ACID guarantees for transactions that span shards. The common view of distributed DBMSs is the diametric opposite: they are believed to scale well horizontally and to support ACID distributed transactions, but to be less efficient in smaller installations.
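As a rough illustration of why sharding alone does not give cross-shard ACID guarantees, here is a minimal Python sketch. It is not taken from the talk: the `Shard` class, the `naive_transfer` function, and the account names are hypothetical, and each shard is modelled as a toy in-memory store that commits its own changes independently.

```python
class Shard:
    """Toy single-node store; each change commits locally and independently."""

    def __init__(self, name, balances):
        self.name = name
        self.balances = dict(balances)
        self.available = True

    def apply(self, account, delta):
        # A local operation either fully succeeds or raises without changes.
        if not self.available:
            raise ConnectionError(f"{self.name} is unreachable")
        if self.balances[account] + delta < 0:
            raise ValueError("insufficient funds")
        self.balances[account] += delta


def naive_transfer(src, src_acct, dst, dst_acct, amount):
    """Two independent local commits: nothing makes the pair atomic."""
    src.apply(src_acct, -amount)  # the debit commits on the first shard
    dst.apply(dst_acct, amount)   # the credit may fail after the debit is done


def total(*shards):
    """Sum of all balances across the given shards."""
    return sum(sum(s.balances.values()) for s in shards)


def demo():
    a = Shard("shard-a", {"alice": 100})
    b = Shard("shard-b", {"bob": 50})
    before = total(a, b)
    b.available = False  # simulate a failure between the two commits
    try:
        naive_transfer(a, "alice", b, "bob", 30)
    except ConnectionError:
        pass  # the debit on shard-a is not rolled back
    return before, total(a, b)
```

In this sketch the total balance differs before and after the failed transfer, because the debit on one shard survives while the credit on the other never happens. A distributed DBMS is expected to rule out this outcome by coordinating the commit across nodes, which is also where some of its overhead in small installations comes from.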

When comparing monolithic and distributed DBMSs, discussions often focus on architecture but rarely provide specific performance metrics. This talk, on the other hand, is entirely based on an empirical study of this issue. Our approach is simple: we installed PostgreSQL and distributed DBMSs on identical clusters of three physical servers and compared them using the popular TPC-C benchmark.

Because no universal, bulletproof PostgreSQL configuration exists, we explore several options. Starting with the most performant and least reliable configuration, without Write-Ahead Logging (WAL) or replication, we gradually move towards less performant but more reliable setups, including one with two synchronous replicas. Throughout this process, we monitor and share hardware metrics for PostgreSQL under load, identify bottlenecks, and fine-tune the system. This allows us to compare PostgreSQL fairly with two distributed DBMSs, CockroachDB and YDB, and to answer the question of when PostgreSQL becomes insufficient and how to address it.
