Community Articles

sinha_pranshu · 02-15-2016

There are times when we need to change the priority of Hadoop jobs. For business-critical reasons, we may want some jobs to run with high priority and others with low priority, so that the important jobs complete earlier.

If the Hadoop cluster uses the Capacity Scheduler with priorities enabled for queues, we can set the priority of our Hadoop jobs. This article explains how to set the priority when a job is submitted and how to change the priority of a job that is already running.

1) Set the priority in a MapReduce program: In a MapReduce program we can set the job priority in the following way.

Configuration conf = new Configuration();

// set the priority to VERY_HIGH

conf.set("mapred.job.priority", JobPriority.VERY_HIGH.toString());

Allowed priority values are: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW

2) Set the priority in a Pig program: We can set the priority of a Pig job using the following property:

job.priority

For example:

grunt> SET job.priority 'high'

If you are setting the priority in a Pig script, write this property before the LOAD statement.

For example:

SET job.priority 'high';

A = LOAD '/user/hdfs/myfile.txt' USING PigStorage() AS (ID, Name);

Acceptable priority values are: very_low, low, normal, high, very_high

Please note that these values are case-insensitive.

3) Set the priority for a Hive query: In Hive we can set the job priority using the following property.

SET mapred.job.priority=VERY_HIGH;

You need to set this value before your query. Allowed priority values are: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW

The mapred.job.priority property is deprecated. The new property is mapreduce.job.priority.

4) Change the priority of a running job: We can also change the priority of Hadoop jobs that are already running.

Usage: hadoop job -set-priority job-id priority

For example:

hadoop job -set-priority job_20120111540_54485 VERY_HIGH

Allowed priority values are: VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW
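If priorities need to be changed from an admin tool or a scheduled task rather than by hand, the command above can be assembled programmatically. The following is a minimal Java sketch and not part of the original article: the class PriorityCommand and its methods are hypothetical names, and the local Priority enum only mirrors the five allowed values listed above (inside a real MapReduce program you would use Hadoop's own JobPriority instead). It validates the requested priority and builds the argument list for hadoop job -set-priority; actually executing the command assumes the hadoop client is installed and on the PATH.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: builds the "hadoop job -set-priority" command line.
final class PriorityCommand {

    // Local mirror of the five allowed priority values listed in the article.
    enum Priority { VERY_HIGH, HIGH, NORMAL, LOW, VERY_LOW }

    // Parse user input case-insensitively; reject anything outside the list.
    static Priority parse(String value) {
        try {
            return Priority.valueOf(value.trim().toUpperCase(Locale.ROOT));
        } catch (IllegalArgumentException e) {
            throw new IllegalArgumentException("Unknown priority: " + value, e);
        }
    }

    // Argument list in the form: hadoop job -set-priority job-id priority
    static List<String> build(String jobId, String priority) {
        return Arrays.asList(
                "hadoop", "job", "-set-priority", jobId, parse(priority).name());
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("Usage: PriorityCommand job-id priority");
            return;
        }
        List<String> cmd = build(args[0], args[1]);
        // Print the command so it can be reviewed before it is run.
        System.out.println(String.join(" ", cmd));
        // To execute it (assumes the hadoop client is on the PATH):
        // new ProcessBuilder(cmd).inheritIO().start().waitFor();
    }
}
```

Validating the value before calling the command line means a typo such as "urgent" fails early with a clear message instead of being passed on to the cluster.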
