Dmitrii Ulianin
Company: Yandex
How to create efficient neural network inference services at the scale of tens of thousands of cores and hundreds of GPUs for a dozen customers.
The talk is aimed at those who: are engaged in MLOps and ML inference; are interested in what inference services look like in Yandex Advertising; have built large systems of services that are constrained by CPU and memory; like to develop their services in C++ and invest in efficiency and optimisations.
Company: T-Bank