
Hive limit number of mappers and reducers



I am running a Hive query that moves data from one table to another.

The first table is stored as 12 files in HDFS.

The second table is stored as 17 files in HDFS, and each file is about 870 MB.

I have set these properties for the Hive-to-Hive import statement:

 

set mapreduce.input.fileinputformat.split.maxsize=858993459;
set mapreduce.input.fileinputformat.split.minsize=858993459;

 

When I query the second table, the job launches 51 mappers and 211 reducers and occupies all of the YARN resources.

I want to restrict the number of mappers and reducers for the Hive query.

Please help me solve this.

 


Re: Hive limit number of mappers and reducers


@ganeshkumarj

 

a. mapred.map.tasks: the default number of map tasks per job (default value 2). Ignored when mapred.job.tracker is "local". You can change it with set mapred.map.tasks=<value>. Note that this is only a hint: the actual number of mappers is determined by the number of input splits.

b. mapred.reduce.tasks: the default number of reduce tasks per job (default value 1). It is typically set to 99% of the cluster's reduce capacity, so that if a node fails the reduces can still be executed in a single wave. Ignored when mapred.job.tracker is "local". You can change it with set mapred.reduce.tasks=<value>. In Hive this property defaults to -1, which lets Hive estimate the reducer count itself.

 

https://hadoop.apache.org/docs/r1.0.4/mapred-default.html
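
As a rough sketch of how this could look in a Hive session, assuming the query runs on the MapReduce engine: the number of mappers follows the number of input splits, and Hive estimates the reducer count from the input size unless it is given a fixed value. The numbers below (1 GiB splits, a cap of 20 reducers) are illustrative assumptions, not tuned recommendations.

```sql
-- Mappers: larger splits mean fewer mappers.
-- 1073741824 bytes = 1 GiB, picked here only because it is larger than
-- one 870 MB file (the 858993459 used in the question is about 0.8 GiB).
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
set mapreduce.input.fileinputformat.split.maxsize=1073741824;
set mapreduce.input.fileinputformat.split.minsize=1073741824;

-- Reducers, option A: keep Hive's estimate but cap it.
-- A larger bytes.per.reducer value lowers the estimate; reducers.max is a hard upper bound.
set hive.exec.reducers.bytes.per.reducer=1073741824;
set hive.exec.reducers.max=20;

-- Reducers, option B: fix the count outright (overrides Hive's estimate).
-- mapreduce.job.reduces is the current name of mapred.reduce.tasks.
-- set mapreduce.job.reduces=20;
```

With 17 files of about 870 MB each, a split size above the file size should bring the mapper count down toward one per file, provided the file format allows it. Fewer, larger tasks also run longer, so it is worth checking the job counters after a change rather than assuming the result.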

Re: Hive limit number of mappers and reducers


Alternatively, you could look into YARN queues and resource allocation.

This will not "restrict" the number of mappers or reducers, but it will control how many can run concurrently by giving the query access to only a subset of the available resources.
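
A minimal sketch of that approach from the Hive side, assuming the MapReduce engine and a queue that a cluster administrator has already defined with a limited capacity. The queue name etl_limited is hypothetical, and the concurrency limits are example values.

```sql
-- Submit the query's jobs to a capacity-limited YARN queue.
-- "etl_limited" is a made-up name; use a queue that exists in your scheduler configuration.
set mapreduce.job.queuename=etl_limited;

-- Optional: cap how many tasks of one job run at the same time
-- (available in Hadoop 2.7 and later; 0 or a negative value means no limit).
set mapreduce.job.running.map.limit=10;
set mapreduce.job.running.reduce.limit=5;
```

The job still creates the same number of map and reduce tasks; they simply run in smaller waves, so the query takes longer but leaves resources free for other work.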