Talk

ML Inference Neural Network Services in Yandex Advertising

  • In Russian
Presentation pdf

How to build efficient neural network inference services at the scale of tens of thousands of cores and hundreds of GPUs for a dozen customers.

The talk is aimed at those who: are engaged in MLOps or ML inference; are interested in what inference services look like in Yandex Advertising; have built large systems of services that are constrained by CPU and memory; like to develop their services in C++ and invest in efficiency and optimisations.
