Talk

How We Migrated from PostgreSQL to a Data Lake on AWS

  • In Russian

Whoosh runs on the AWS stack (PostgreSQL, S3, Redshift) and builds all of its data models in dbt, with a little Python. For the data engineering team, this could be called the year of the move. The overall plan was to migrate everything that lived in a single store (PostgreSQL), including all business reporting and the dbt models, onto a Data Lake architecture. The goal was to optimize costs: Aurora (PostgreSQL) charges for every query, while Redshift is an MPP columnar database with a constant cost of n$/hour (and yes, it runs faster). During the migration, however, it turned out that this solution is not suitable for geo tasks: Redshift is based on version 8 of Postgres (surprise!), which has no well-supported way of working with geometry, truncates cell values over a certain length, and does not get along with JSON keys at all.
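The truncation problem mentioned above can be checked for before data is loaded. The following is a minimal sketch, not the team's actual tooling: the helper name `oversized_cells` and the sample rows are hypothetical, and the 65535-byte limit is an assumption taken from Redshift's documented VARCHAR maximum (the abstract only says "a certain length").

```python
# Assumption: 65535 bytes is Redshift's documented maximum VARCHAR size;
# the talk abstract itself does not name the exact limit.
REDSHIFT_VARCHAR_MAX_BYTES = 65535


def oversized_cells(rows, limit=REDSHIFT_VARCHAR_MAX_BYTES):
    """Return (row_index, column, byte_length) for each text value over the limit."""
    findings = []
    for index, row in enumerate(rows):
        for column, value in row.items():
            # Only text values are checked; the limit is counted in bytes, not characters.
            if isinstance(value, str):
                size = len(value.encode("utf-8"))
                if size > limit:
                    findings.append((index, column, size))
    return findings


# Hypothetical rows, as they might be read from PostgreSQL before the move.
sample_rows = [
    {"id": 1, "payload": "short value"},
    {"id": 2, "payload": "x" * 70000},
    {"id": 3, "payload": "я" * 40000},  # 40000 characters, 80000 bytes in UTF-8
]

print(oversized_cells(sample_rows))
# → [(1, 'payload', 70000), (2, 'payload', 80000)]
```

Counting bytes rather than characters matters for non-ASCII text: the third row is under the limit by character count but over it once encoded.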

  • #migration
  • #postgres
  • #aws
  • #redshift
