Segmenting 600 million users in real time every day

Day 1 / Hall 3 / RU

Every day, users perform millions of actions on the Internet. The FACETz DMP project has to structure this data and segment users in order to reveal their preferences. We will explain how, using Kafka and HBase, we manage to:

  • segment 600 million users after switching from MapReduce to real-time processing, and how we made that switch;
  • process 5 billion events every day;
  • keep statistics on the number of unique users in each segment during stream processing;
  • monitor the effect of changes to segmentation parameters.
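
To give a feel for the unique-user statistics mentioned above, here is a minimal sketch of one common way to count distinct users per segment in a stream without storing every user ID: a HyperLogLog estimator. This abstract does not say which method FACETz actually uses, so the technique, the class names (`HyperLogLog`, `SegmentStats`) and the precision parameter `p` are all illustrative assumptions.

```python
import hashlib
import math


class HyperLogLog:
    """Approximate distinct counter (illustrative; not the FACETz implementation)."""

    def __init__(self, p: int = 12):
        self.p = p                      # number of index bits (assumed value)
        self.m = 1 << p                 # number of registers
        self.registers = [0] * self.m
        # Bias-correction constant, valid for m >= 128.
        self.alpha = 0.7213 / (1 + 1.079 / self.m)

    def add(self, item: str) -> None:
        # Take a stable 64-bit hash of the item.
        digest = hashlib.sha1(item.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        idx = h >> (64 - self.p)                    # first p bits pick a register
        rest = h & ((1 << (64 - self.p)) - 1)       # remaining 64 - p bits
        # Rank = position of the leftmost 1 bit in the remaining bits.
        rank = (64 - self.p) - rest.bit_length() + 1
        if rank > self.registers[idx]:
            self.registers[idx] = rank

    def count(self) -> int:
        z = sum(2.0 ** -r for r in self.registers)
        estimate = self.alpha * self.m * self.m / z
        zeros = self.registers.count(0)
        # Small-range correction: fall back to linear counting.
        if estimate <= 2.5 * self.m and zeros:
            estimate = self.m * math.log(self.m / zeros)
        return int(round(estimate))


class SegmentStats:
    """Keeps one estimator per segment, updated as events arrive."""

    def __init__(self):
        self._by_segment = {}

    def record(self, segment_id: str, user_id: str) -> None:
        hll = self._by_segment.setdefault(segment_id, HyperLogLog())
        hll.add(user_id)

    def unique_users(self, segment_id: str) -> int:
        hll = self._by_segment.get(segment_id)
        return hll.count() if hll else 0
```

In a streaming job, `record` would be called once per (segment, user) match as events are consumed, and `unique_users` could be read at any time. Repeated events from the same user do not change the estimate, and memory per segment stays fixed regardless of audience size, at the cost of a small relative error (roughly 1.6% for `p = 12`).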