Daria Barsukova
Postgres Pro
Vector search is no longer exotic: RAG systems, semantic search, and recommendations all rely on embeddings. In PostgreSQL, the go-to extension for this is pgvector. But once your data grows to hundreds of thousands or millions of vectors, a question arises: how does it work under the hood?
In this talk, I'll cover not just the types of indexes available, but also the internals of pgvector. We'll look at how vectors are physically stored in PostgreSQL, what distinguishes HNSW from IVFFlat at the level of data structures and algorithms, where precision loss comes from, and how index parameters affect it.
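To make the link between index parameters and accuracy concrete, here is a minimal pure-Python sketch of an IVF-style index. It is not pgvector's code: centroids are sampled at random rather than trained with k-means, the data is synthetic, and all function names are hypothetical. Only the parameter names `lists` and `probes` mirror pgvector's IVFFlat settings. It reads "precision loss" as lost recall, which is an interpretation rather than something the abstract states.

```python
import random


def dist(a, b):
    """Squared Euclidean distance (same ordering as plain L2)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def build_ivf(vectors, n_lists, rng):
    """Toy IVF build: sample centroids, assign each vector to its nearest one."""
    centroids = rng.sample(vectors, n_lists)
    lists = [[] for _ in range(n_lists)]
    for idx, v in enumerate(vectors):
        nearest = min(range(n_lists), key=lambda c: dist(v, centroids[c]))
        lists[nearest].append(idx)
    return centroids, lists


def ivf_search(query, vectors, centroids, lists, k, probes):
    """Scan only the `probes` lists whose centroids are closest to the query."""
    order = sorted(range(len(centroids)), key=lambda c: dist(query, centroids[c]))
    candidates = [i for c in order[:probes] for i in lists[c]]
    return sorted(candidates, key=lambda i: dist(query, vectors[i]))[:k]


def exact_search(query, vectors, k):
    """Brute-force baseline: the true k nearest neighbours."""
    return sorted(range(len(vectors)), key=lambda i: dist(query, vectors[i]))[:k]


def recall_at_k(vectors, queries, centroids, lists, k, probes):
    """Share of true neighbours that the approximate search also returns."""
    hits = 0
    for q in queries:
        truth = set(exact_search(q, vectors, k))
        found = set(ivf_search(q, vectors, centroids, lists, k, probes))
        hits += len(truth & found)
    return hits / (k * len(queries))


rng = random.Random(42)
DIM, N, N_LISTS, K = 8, 2000, 20, 10  # illustrative sizes, not a benchmark
vectors = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(N)]
queries = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(20)]
centroids, lists = build_ivf(vectors, N_LISTS, rng)

recalls = {p: recall_at_k(vectors, queries, centroids, lists, K, p) for p in (1, 5, 20)}
for p, r in recalls.items():
    print(f"probes={p:2d}  recall@{K}={r:.2f}")
```

Probing every list degenerates into an exact scan, so recall reaches 1.0 at the cost of visiting all vectors. Probing fewer lists is faster but can miss neighbours that landed in an unvisited list.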
We'll also look at load testing as a tool for validating hypotheses. I'll show at what data volumes performance starts to degrade, why filtering by metadata can kill search speed, and how pgvector behaves under concurrent load.
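One way to build intuition for the filtering cost is a simplified model in which rows are visited in distance order and the metadata filter is applied afterwards. The sketch below counts how many rows must be visited before `k` of them pass the filter. It is an illustration under that assumption, not a description of how pgvector executes a filtered query; the predicates and the function name are hypothetical, and the data is synthetic.

```python
import random


def candidates_examined(query, vectors, matches_filter, k):
    """Walk rows in distance order; count rows visited until k pass the filter."""
    order = sorted(
        range(len(vectors)),
        key=lambda i: sum((x - y) ** 2 for x, y in zip(query, vectors[i])),
    )
    found = 0
    for visited, i in enumerate(order, start=1):
        if matches_filter(i):
            found += 1
            if found == k:
                return visited
    return len(order)  # fewer than k rows match the filter


rng = random.Random(7)
vectors = [[rng.random() for _ in range(8)] for _ in range(5000)]
query = [rng.random() for _ in range(8)]


# Hypothetical metadata predicates keyed on the row id.
def common(i):
    return i % 2 == 0  # about 50% of rows match


def rare(i):
    return i % 100 == 0  # about 1% of rows match


cost_common = candidates_examined(query, vectors, common, k=10)
cost_rare = candidates_examined(query, vectors, rare, k=10)
print(f"rows visited: common filter={cost_common}, rare filter={cost_rare}")
```

In this model the work grows roughly in inverse proportion to the share of rows that match, so a highly selective filter forces the search to walk through many more candidates to return the same number of results.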
Finally, we'll formulate practical recommendations: for which scenarios pgvector is a suitable solution, and where its use still carries risks. This talk will be useful for developers who are already experimenting with pgvector or are just considering it and want to understand its real-world limits.