Types of Schedulers in Hadoop

YARN is the part of Hadoop responsible for resource allocation. The ResourceManager, YARN's master daemon, tracks and assigns the resources required by every application in the cluster. Its two main components are the Scheduler and the ApplicationsManager. The scheduler, as the name suggests, schedules the processing of jobs: it places submitted jobs in queues and, following a scheduling algorithm, decides the order in which they receive resources.

There are three types of schedulers:

-- FIFO Scheduler: The First In, First Out scheduler. Jobs are placed in a single queue and run in the order in which they were submitted. No matter how urgent a job is, it has to wait its turn, so a small job can be stuck behind a long one. FIFO was the default scheduler in early Hadoop releases and needs essentially no configuration; in current Apache Hadoop releases, YARN defaults to the Capacity Scheduler.

-- Capacity Scheduler: The cluster is divided into multiple job queues, each with a share of the cluster resources assigned to it. Jobs submitted to a queue run in the order they arrive, as soon as that queue has free resources. It suits clusters shared by several teams or by workloads of different priority, because each queue keeps its assigned capacity no matter how busy the others are.

-- Fair Scheduler: Cluster resources are rebalanced dynamically so that, over time, every running job gets a fair share. When a new job is submitted, it receives resources released by jobs that are already running (or reclaimed from them, if preemption is enabled), so short or high-priority jobs do not have to wait behind long-running ones. Priority can be taken into account by weighting the shares. It is more complex to set up than FIFO and requires configuration.
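The difference between FIFO and fair sharing is easiest to see with a small calculation. The following is a minimal sketch, not YARN code: it assumes all jobs are submitted at time 0, measures each job in abstract "work units", and models fair sharing as an idealized equal split of the whole cluster among running jobs. Real schedulers hand out containers rather than fluid shares, and the function names and the job list are made up for illustration.

```python
from fractions import Fraction


def fifo_finish_times(jobs):
    """Finish time of each job when the cluster runs one job at a time,
    in submission order. `jobs` is a list of (name, work_units)."""
    clock = 0
    finish = {}
    for name, work in jobs:
        clock += work          # the job holds the whole cluster until done
        finish[name] = clock
    return finish


def fair_finish_times(jobs):
    """Finish time of each job when every unfinished job gets an equal
    share of the cluster (an idealized model of fair sharing)."""
    remaining = {name: Fraction(work) for name, work in jobs}
    clock = Fraction(0)
    finish = {}
    while remaining:
        n = len(remaining)
        smallest = min(remaining.values())
        # Each job progresses at rate 1/n, so the smallest job
        # needs smallest * n time units to complete.
        clock += smallest * n
        for name in list(remaining):
            remaining[name] -= smallest
            if remaining[name] == 0:
                finish[name] = clock
                del remaining[name]
    return finish


# Hypothetical workload: one long job submitted just before two short ones.
jobs = [("long", 10), ("short_a", 1), ("short_b", 1)]

print(fifo_finish_times(jobs))  # short jobs wait behind the long one
print(fair_finish_times(jobs))  # short jobs finish early, long job slightly later
```

Under FIFO the short jobs finish at times 11 and 12; under the fair model they both finish at time 3, while the long job finishes at 12 instead of 10. That trade-off is the core of the choice between the two schedulers.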

The choice of scheduler matters for making the best use of resources in a Hadoop cluster. With multiple clients, jobs of differing priority and varying resource requirements, it is important to use the appropriate one. The Fair Scheduler is a good option in many cases, since it keeps the cluster busy while still letting important jobs get resources quickly.
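For the multi-client case, the capacity-queue idea can be sketched the same way. This is a simplified model under stated assumptions, not the Capacity Scheduler itself: the queue names `prod` and `dev`, the 60/40 split, the cluster size and the job list are all hypothetical, each queue is treated as a hard partition of cluster memory, and jobs within a queue are admitted strictly in submission order. The real scheduler can also let a queue borrow idle capacity from others, which this sketch ignores.

```python
def queue_guarantees(total_memory_mb, queue_percent):
    """Split cluster memory among queues by percentage."""
    if sum(queue_percent.values()) != 100:
        raise ValueError("queue percentages must add up to 100")
    return {q: total_memory_mb * pct // 100 for q, pct in queue_percent.items()}


def admit(guarantees, jobs):
    """Admit jobs in submission order. A job starts if its queue still has
    enough free memory; otherwise it waits, and so does every later job in
    the same queue (FIFO within a queue). Other queues are unaffected.
    `jobs` is a list of (queue, name, memory_mb)."""
    free = dict(guarantees)
    blocked = set()
    running, waiting = [], []
    for queue, name, memory_mb in jobs:
        if queue not in blocked and memory_mb <= free[queue]:
            free[queue] -= memory_mb
            running.append(name)
        else:
            blocked.add(queue)
            waiting.append(name)
    return running, waiting


# Hypothetical 100 GB cluster shared by two teams.
guarantees = queue_guarantees(102400, {"prod": 60, "dev": 40})

jobs = [
    ("prod", "etl", 40000),
    ("dev", "notebook", 30000),
    ("prod", "report", 30000),   # does not fit in what is left of prod
    ("dev", "test", 8000),       # dev still has room, so it starts
]
running, waiting = admit(guarantees, jobs)
print(guarantees)
print("running:", running)
print("waiting:", waiting)
```

Here the `report` job waits because the `prod` queue is full, yet the later `test` job starts immediately in `dev`: a busy queue delays only its own jobs.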

Rema Shivakumar- CuriouSTEM Staff

CuriouSTEM Content Director - Computer Science
