Sergey Zhemzhitsky
Company: Arenadata
This talk covers data sketches, also known as streaming algorithms (HyperLogLog, CPC, Theta, Count-Min, FDT, KLL, etc.), for data processing and analysis. They are designed to solve problems such as counting unique elements, estimating the distribution of elements, and determining element frequencies.
In most of these cases, obtaining exact results requires significant computation and time. When approximate results are acceptable, data sketches produce them much faster than traditional solutions. In batch processing, alternatives often exist; in stream processing, sketches are usually the only viable way to get a result.
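To make the idea of approximate unique counting concrete, here is a minimal sketch of a K-Minimum-Values (KMV) estimator, a simplified relative of the Theta sketch family. It is an illustration under stated assumptions, not the talk's material and not the API of any sketch library: the class name `KMVSketch`, the parameter `k`, and the choice of SHA-256 as the hash function are all hypothetical choices made for this example.

```python
import hashlib
import heapq


class KMVSketch:
    """Hypothetical K-Minimum-Values sketch: keeps the k smallest hash values seen."""

    def __init__(self, k=1024):
        self.k = k
        self._heap = []        # max-heap (negated values) of the k smallest hashes
        self._members = set()  # hashes currently stored, used to skip duplicates

    @staticmethod
    def _hash(item):
        # Map the item to a pseudo-uniform float in [0, 1).
        digest = hashlib.sha256(str(item).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") / 2**64

    def update(self, item):
        h = self._hash(item)
        if h in self._members:
            return  # this hash is already retained
        if len(self._heap) < self.k:
            heapq.heappush(self._heap, -h)
            self._members.add(h)
        elif h < -self._heap[0]:
            # New hash is smaller than the largest retained one: swap them.
            evicted = -heapq.heappushpop(self._heap, -h)
            self._members.discard(evicted)
            self._members.add(h)

    def estimate(self):
        if len(self._heap) < self.k:
            # Fewer than k distinct hashes seen: the count is exact.
            return float(len(self._heap))
        kth_smallest = -self._heap[0]
        return (self.k - 1) / kth_smallest


# Usage: a stream of 200,000 events containing 100,000 distinct values.
sketch = KMVSketch(k=1024)
for i in range(200_000):
    sketch.update(i % 100_000)
approx_distinct = sketch.estimate()
```

The sketch stores at most `k` hash values regardless of stream length, which is why this style of algorithm suits stream processing. The relative standard error of this estimator is roughly `1 / sqrt(k)`, so a larger `k` trades memory for accuracy.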
Company: Positive Technologies