SmartData talks

Anna Veronika Dorogush
Yandex
Day 1 / 11:40 / Track 3 / For practicing engineers

CatBoost: gradient boosting training on large data volumes

In this session we will briefly explain what gradient boosting is and what it is used for, cover the library's main features, and focus on training boosted models on large volumes of data.

Aleksandr Tobol
Odnoklassniki
Day 1 / 11:40 / Track 1 / For practicing engineers

Recognizing 330 million faces at a speed of 1500 photos/sec

We'll look at the pipeline for building user vectors and searching for users in uploaded photos; neural network training; a face detector based on a cascade of neural networks and its optimization; building the rescaled user vector on GPU; and hardware and optimizations, launching in the cloud, and fault tolerance.

Ivan Yamshchikov
ABBYY
Day 1 / 11:40 / Track 2 / Introduction to technology

Machine learning and the two titanium marbles

We'll discuss how machine learning in the harsh reality of the enterprise differs from machine learning in B2C, see whether it's possible to build AI solutions when data is scarce, and talk about best practices for ML in production, using ABBYY products as an example.

Jerome Bellegarda
Apple
Day 1 / 10:30 / Track 1 / Introduction to technology

The deep learning revolution

Jerome will illustrate how the present deep learning revolution is changing the way we interact with technology in our daily lives, address the central question of privacy breach, and finally discuss how to alleviate the inherent tension between leveraging users' data and maintaining data privacy.

Roman Nozdrin
MariaDB Corporation
Day 1 / 12:50 / Track 2 / For practicing engineers

Change data capture from MariaDB and PostgreSQL to the analytical engine MariaDB ColumnStore

In this talk we'll discuss and demonstrate CDC methods for two major open source DBMSs, MariaDB and PostgreSQL. For MariaDB we'll use its native solution, MaxScale; for PostgreSQL, a stack of Kafka, Debezium, and the ColumnStore write API.

Viktor Gamov
Confluent
Day 1 / 14:40 / Track 3

Crossing the streams: rethinking stream processing with KStreams and KSQL

Viktor Gamov will introduce Kafka Streams and KSQL — an important recent addition to the Confluent open source platform that lets us build sophisticated stream processing systems with little to no code at all!

Ilia Larchenko
DOC+
Day 1 / 14:40 / Track 1 / For practicing engineers

ML for doctors' work optimisation

Data science in primary health care: a chatbot symptom checker and a medical quality control system.

Alexey Milovidov
Yandex
Day 1 / 15:50 / Track 3 / For practicing engineers

Obfuscating databases

Modified or artificial data sets that resemble real data as closely as possible can be used for performance testing, algorithm debugging, and machine learning. ClickHouse development requires data sets that approximate Yandex.Metrica data. Alexey will describe the four approaches the team tried to solve this problem, which one eventually won, and how you can use it.

Anton Slesarev
Yandex
Day 1 / 14:40 / Track 2

Title will be announced soon

Description will be announced soon

Dmitry Solomentsev
Yandex
Day 1 / 12:50 / Track 3

Title will be announced soon

Description will be announced soon

Dmitry Goryunov
Zalando SE
Day 1 / 17:00 / Track 3

Organizing access to Zalando's Data Lake

Dmitry will give a retrospective on the development of a data lake at one of Europe's biggest e-commerce companies. The topics revolve around organizing access to unorganized data. The talk recaps the experience of Dmitry and his team with access management, metadata management, execution engines, visualization tools, data governance, and machine learning enablement.
