From click to predict and back: Data Science pipelines at OK

Day 1 / Hall 2 / RU

Machine learning is fun, but making it work in industry takes a lot of boring work. In this talk we will cover the technologies, algorithms and methods needed to make your ML shine like a diamond in a beautiful setting.

As an example we will consider a single but complex task: news feed personalization. Without diving into ML details, we will talk about real-time and batch data collection, ETL, and the processing needed to build the model.

But getting the model is not enough, so we will also discuss how to obtain model predictions in a complex, high-load distributed environment and use them for decision making.

The talk covers a range of data processing and storage technologies from the Hadoop ecosystem and beyond. If you are doing ML not just for fun but also for profit, you might get some useful information from the talk.

