In this talk Sharad will talk about how YARN lowers the barrier to do innovation and opens up various possibilities. He will discuss the YARN component architecture, various design choices, touch upon how YARN is designed to scale tens of thousands of machines, fault tolerance and recovery aspects built in to the system.
Hadoop MapReduce in NextGen (MR v2/YARN) has undergone complete runtime re-architecture. The resource management and application lifecycle management are split into different components from the single JobTracker. Hadoop becomes the general purpose data and compute platform, that can support compute paradigms other than MapReduce like Graph processing, MPI etc. MapReduce v2 has got some of the new features and capabilities, which are not present in the classic MapReduce.
The side goals of the new system are scalability, modularity and keeping in mind the backward compatibility with the classic MR runtime features so that the MR applications can seamlessly migrate to the new runtime. In this session I will talk about the component architecture and fundamental changes in the concurrency management to avoid global locks as it exist in JobTracker. The new framework make much effective use of multi-core hardware and is able to scale to tens of thousands of machines with lower job latencies.
The complex state management of various entities like job, task, node, container etc. are handled via declarative state machine. Briefly I will talk about how the lightweight event model coupled with state management comes handy in designing fault tolerance and automatic recovery from failures the new runtime offers.