SmartData talks

Ivan Yamschikov
The Max Planck Institute, Leipzig, Germany / Creaited Labs
Day 1 / 17:50 / Track 1 / RU / Introduction to the technology

Neurona: why we taught a neural network to write poems in the style of Kurt Cobain

We'll discuss current challenges in developing creative AI and why they are important and interesting, and share our experience of creating Neurona, Neuronnaya Oborona and Pianola.

Dmitry Bugaychenko
Odnoklassniki
Day 1 / 14:25 / Track 1 / RU / For practicing engineers

From click to predict and back: Data Science pipelines at Odnoklassniki

We will take one complex task, news feed personalization, and talk about building a model and obtaining its predictions, along with various data processing and storage technologies from the Hadoop ecosystem.

Artem Marinov
Directual
Day 1 / 15:35 / Track 1 / RU / For practicing engineers

Segmenting 600 million users in real time every day

How we changed the platform's architecture to process data on 600 million users in real time (hundreds of thousands of events per second), which problems we faced, and how we dealt with them.

Sergey Nikolenko
PDMI RAS
Day 1 / 12:50 / Track 1 / RU / For practicing engineers

Deep convolutional networks for object detection and image segmentation

In this talk we'll discuss how CNNs have evolved from classifying individual objects to detecting multiple objects in an image. We'll look closely at YOLO, single-shot detectors, and the range of models from R-CNN to Mask R-CNN.

Alexey Potapov
ITMO
Day 1 / 11:40 / Track 2 / RU / Hardcore. A complex, low-level talk that requires the listener to know the technology.

Deep learning, probabilistic programming and meta-estimation: point of intersection

We'll discuss how generative and discriminative models are connected in terms of program specialization, and their role in deep learning and probabilistic programming. We'll also address the neural Bayesian approach and neural probabilistic programming as an integration of the two paradigms, using the Edward library as an example.

Vitaly Khudobakhshov
Odnoklassniki
Day 1 / 10:30 / Track 1 / RU / Introduction to the technology

Name is a feature

We'll talk about the most unexpected and counterintuitive observations you can make by analyzing social network data, along with the statistical significance of such observations, the influence of bots, and false correlations.

Artem Grigoriev
Yandex
Day 1 / 11:40 / Track 3 / RU / Introduction to the technology

Crowdsourcing: How to train your crowd

Drawing on the experience of building and using the Yandex.Toloka crowdsourcing platform, this talk shares tips on quality control, motivating performers, and various models for aggregating individual judgements.

Anna Veronika Dorogush
Yandex
Day 1 / 15:35 / Track 3 / RU / For practicing engineers

CatBoost — the next generation of gradient boosting

How to use CatBoost, an open-source gradient boosting algorithm, effectively: who benefits from it now, where it will be used, and who should pay attention to it.

Andrey Boyarov
Mail.Ru Group
Day 1 / 14:25 / Track 3 / RU / For practicing engineers

Deep Learning: Scene recognition and landmark recognition in images

In this talk we will discuss the development of a system for solving the problem of scene recognition with the help of a state-of-the-art approach based on deep convolutional neural networks.

Vladimir Krasilschik
Yandex
Day 1 / 14:25 / Track 2 / RU / Introduction to the technology

Back to the future of a modern banking system

This talk will cover Audit-Driven Development and its origins, how to organize a bitemporal database of facts, and why every modern distributed system needs a built-in time machine. We'll also share a "universal formula of fact" and discuss what the tasks of so-called "analytics" most often turn out to be.

Alexander Serbul
1C-Bitrix
Day 1 / 12:50 / Track 2 / RU / Introduction to the technology

Applied machine learning in e-commerce — scenarios and architectures of pilots and real-world projects

We'll present a number of the company's pilots and real-world projects that apply both popular and "rare" machine learning algorithms, along with their technical implementation on platforms such as Java, PHP, and Python, using open libraries and a range of Amazon Web Services tools.

Ivan Drokin
BrainGarden
Day 1 / 15:35 / Track 2 / RU / Introduction to the technology

No data? No problem! Deep Learning with CGI

We will consider an example of training deep convolutional networks for object keypoint localization on a fully synthetic dataset.

Alexander Sibiryakov
Scrapinghub
Day 1 / 16:45 / Track 3 / RU / Introduction to the technology

Automatic contact information extraction from the web

This talk is about a distributed web crawler that finds and extracts contact information from corporate websites.

Alexander Krasheninnikov
Badoo
Day 1 / 16:45 / Track 1 / RU / For practicing engineers

Hadoop high availability: Badoo experience

We will discuss how to provide high availability of Hadoop cluster components and why we need it.

Alexey Natekin
DM Labs, Arktur, Open Data Science
Day 1 / 16:45 / Track 2 / RU / For practicing engineers

Lock, stock and two boosting barrels

We'll try to figure out whether you need a GPU for training gradient boosting models in 2017-2018.

Boris Shminke
ivi
Day 1 / 12:50 / Track 3 / RU / For practicing engineers

Distributed ML on Big Data: recommender system building experience at ivi

How we went from a single matrix multiplication script to our own Hadoop data lake, distributed machine learning with Spark, an offline evaluation framework in Scala and, as a result, a recommender system that is highly tunable to business needs.

Mikhail Kamalov
Epam Systems
Day 1 / 11:40 / Track 1 / RU / For practicing engineers

Recommendation systems: from matrix factorization to deep learning in stream mode

The talk will cover practical aspects of using deep learning, collaborative filtering, content-based filtering, and time-based filtering, as well as other approaches to recommendation systems. It will also consider building hybrid recommender systems and adapting these approaches for online learning on Spark.
