SmartData 2020

Talks

  • Watch recording

    Highly Normalized Hybrid Model, or how we implemented the storage model

    A DWH structure is not very flexible, and modern design approaches such as Data Vault and Anchor modeling help fix this. Evgeny and Nikolay will explain how to choose between them.

    • Evgeny Ermakov

      Yandex Go

    • Nikolay Grebenshchikov

      Yandex Go

    In Russian
  • Watch recording

    Kusto (Azure Data Explorer): Microsoft's interactive Big Data platform

    In this session, Alexander will explain what sets Kusto (Azure Data Explorer) apart from other solutions, show how complex analysis of live telemetry across billions of records can take seconds, and lift the curtain on the architecture Kusto is built on.

    • Aleksandr Sloutsky

      Microsoft

    • Gleb Lesnikov

      Dodo Engineering

    In Russian
  • Watch recording

    Versioning database structure taking storage as an example

    Vladislav will talk about versioning a database structure, using Lamoda's storage as an example.

    • Vladislav Shishkov

      Lamoda

    In Russian
  • Watch recording

    Data initiation in NiFi

    We will talk about NiFi as an ETL tool and about data initiation for streaming. Bronislav will describe some practices and tips that Tinkoff uses.

    • Bronislav Zhitnikov

      Tinkoff

    In Russian
  • Watch recording

    Digitizing a worker in real-time

    How data from wearable devices travels to the user interface of the Digital Worker system.

    • Alexey Konyaev

      CROC

    In Russian
  • Watch recording

    Approaches to building a modern data platform. The problems and the concept of implementation

    Alexander will talk about the main characteristics of the modern data platform, the differences in the DWH architecture, the components used, and the open source distribution of Hadoop.

    • Aleksandr Ermakov

      Arenadata

    In Russian
  • Watch recording

    Safe interactive big data at the bank: Business intelligence on Clickhouse

    In his talk, Pavel will tell you what caused data fragmentation in his organization, and what typical analytics scenarios suffer as a result. He will also explain why the classic approach did not work for Deutsche Bank and what they learned to do differently.

    • Pavel Yakunin

      Russian Tech Centre Deutsche Bank

    In Russian
  • Watch recording

    Flink + Zeppelin: Streaming data analytics platform

    In this talk, Jeff will talk about how to use Flink on Zeppelin to build your own streaming data analytics platform.

    • Jeff Zhang

      Alibaba Group

  • Watch recording

    Enterprise data platform: Data infrastructure as a testing ground for business hypotheses

    A talk about S7's experience building a data platform and how long it took.

    • Andrey Zhukov

      S7 Techlab

    In Russian
  • Watch recording

    The latest and greatest of Delta Lake

    This talk is a gentle introduction to the latest and greatest of Delta Lake. You will learn what Delta Lake is and what challenges it aims to solve.

  • Watch recording

    Writing flexible pipelines for data platforms with Dagster

    How do you get Spark + Scala jobs and Python apps to work together? Andrey will explain why it's worth doing and how to write pipelines with reusable blocks and a flexible architecture using Dagster.

    • Andrey Kuznetsov

      Odnoklassniki

    In Russian
  • Watch recording

    Segmentation: A single window of knowledge about a user

    Maria and Olga will present a talk on how to build an analytics system, which significantly expands business opportunities, using JVM and open source technologies.

    • Olga Makarova

      ivi

    • Maria Nosareva

      ivi

    In Russian
  • Watch recording

    NeoFS: Storing object data according to your rules

    Stanislav will share an example of how you can replace centralized S3 data storage with a more accessible solution and organize policies so that data processing becomes more efficient. He will also explain where multigraphs, homomorphic cryptography, multi-pass games, zero-knowledge proofs, and other mathematics come in.

    • Stanislav Bogatyrev

      NEO Saint Petersburg Competence Center

    In Russian
  • Watch recording

    Review of the big data technologies. Pros and cons

    Maksim's talk is about the pros and cons of various solutions for storing data: Cloud Solutions, Bare Metal Solutions, Hadoop, Vertica, ClickHouse, ExaSol, GreenPlum (ArenaDataDB), RDBMS, Teradata, and others.

    • Maksim Statsenko

      Yandex

    In Russian
  • Watch recording

    How we develop DMP for Taxi, Food, and Lavka

    Vladimir will talk about what motivates developing your own ETL tool and about transforming ETL and DWH into a DMP. He will share the problems that arise during DMP development and the experience of solving them.

    • Vladimir Verstov

      Yandex.Go

    In Russian
  • Watch recording

    How we built Serverless Spark experience on Kubernetes

    During this session, we'll talk about the architecture, why Staroid used Kubernetes, what the challenges were, and how the company solved them. You will also see a working demo so you can get an idea of what the Serverless Spark experience looks like and how it can benefit your work.

    • Moon soo Lee

      Staroid, Inc.

  • Watch recording

    On the way from Kafka to NiFi: How not to break and not lose

    This talk is about building a fail-safe system for an Apache NiFi cluster using Apache Kafka as an input source.

    • Roman Korobeynikov

      VirtualHealth

    In Russian
  • Watch recording

    SQL migrations to Postgres under load

    Migrating a table is not a problem when the database is stopped. But what if you need to migrate while the database is running? Nikolay will cover this in the form of practical tips for PostgreSQL.

    • Nikolay Averin

      Miro

    In Russian
  • Watch recording

    CI/CD for ML models and datasets

    Suppose a DS model of mediocre quality is in production and there is now no way to retrain or update it. To avoid ending up in this situation, come and listen to Mikhail's talk.

    • Mikhail Maryfich

      Mail.Ru Group

    In Russian
  • Watch recording

    Working with data at a low level

    Let's talk about some technologies that can help you get more out of your machine: JIT, BLAS, and parallelism.

    • Nikolay Markov

      Aligned Research Group

    In Russian
  • Watch recording

    AI-augmented data preparation: Building technology-agnostic data pipelines for modern data stacks with AI

    Evgeny will talk about current trends in the Modern Data Stack, the pros and cons of the old (ETL) and new (ELT) approaches, and the reasons that led to creating their own DSL.

    • Evgeny Legky

      Retable

    In Russian
  • Watch recording

    Scio — data processing at Spotify

    We'll talk about the evolution of big data at Spotify, from Python, Hadoop, Hive, Storm, and Scalding to today's world of cloud and serverless computing.

    • Neville Li

      Spotify

  • Watch recording

    Our repository for web analytics

    Using the history of building a repository for an advanced web analytics service as an example, Artur will describe how the storage and reporting system in his project has evolved over the past 5 years.

    • Artur Hachuyan

      Tazeros

    In Russian
  • Watch recording

    Kusto (Azure Data Explorer): Architecture and internals

    A talk about the principles of building a new database from scratch for working with logs and telemetry.

    • Evgeny Rizhik

      Microsoft

    In Russian
  • Watch recording

    Stateful streaming: Cases, patterns, implementations

    During this session, we will talk about a popular approach to data processing, stream processing, with a focus on working with state.

    • Dmitry Bugaychenko

      Sber

    In Russian
  • Watch recording

    Kotlin API for Apache Spark: Why we made another API for working with Spark

    Pasha and Vitaliy will talk about what data engineers choose and why they decided to make an API for one of the most popular frameworks for building pipelines.

    • Pasha Finkelstein

      JetBrains

    • Vitaly Khudobakhshov

    In Russian
  • Watch recording

    Round Table: Programming languages in Data Engineering

    We'll be discussing a wide variety of languages and technologies that data engineers are currently working with.

    • Vitaliy Bragilevskiy

      JetBrains

    • Pasha Finkelstein

      JetBrains

    • Vitaly Khudobakhshov

    In Russian
  • Watch recording

    Demo: Big Data tools

    Join us for a presentation of a new JetBrains product: the Big Data Tools plugin. We will discuss its most significant use cases and provide a short demonstration using real-world examples. All questions will be answered by the developers directly involved in BDT development.

    • Oleg Chirukhin

      JetBrains

    In Russian
  • Watch recording

    Conference opening

    Find out what awaits you in the next 4 days. The program committee will talk about the schedule, interesting talks, and the format in which they will be held. The organizers, in turn, will tell you how the platform works, where the discussion zones will be held, how to connect to chat rooms, and where to ask questions.

    • Alexey Fyodorov

      JUG Ru Group

    • Vitaly Khudobakhshov

    In Russian
  • Watch recording

    Conference closing

    Join the SmartData closing with the program committee: we will discuss the most interesting talks and chats, as well as talks worth revisiting after the conference.

    • Alexey Fyodorov

      JUG Ru Group

    • Sergey Boytsov

      JetBrains

    In Russian
  • Watch recording

    How to master time and space

    Applying MLOps to a high-performance geospatial data platform for the edge and cloud.

    • Phil Laszkowicz

      Futurice

Data Engineering conference

JUG Ru Group

Need help?

  • Phone: +7 (812) 313-27-23
  • Email: support@smartdataconf.ru
  • Telegram: @JUGConfSupport_bot

© JUG Ru Group, 2017–2026