Hadoop is a popular solution for tasks such as distributed data storage and MapReduce computation on a cluster. High scalability and a mature ecosystem make Hadoop one of the most popular frameworks in the infrastructure of various information systems. But the more responsibility you place on it, the more important its fault tolerance and high availability become. This talk will be useful to those who already use Hadoop; others may find it interesting to learn about some of the architectural solutions used in Hadoop. We will discuss how to provide high availability for the components of a Hadoop cluster. We will also cover:
- “the zoo” we deal with;
- why we should provide high availability: points of failure in the system and their consequences;
- tools and solutions for these problems;
- our practical implementation experience: preparation, deployment, and testing.