Hive limit number of mappers and reducers

I am running a Hive query that moves data from one table to another.

 

The first table is stored as 12 files in HDFS.

The second table is stored as 17 files in HDFS.

 

Each file of the second table is about 870 MB.

 

I have set these properties for the Hive-to-Hive import statement:

 

set mapreduce.input.fileinputformat.split.maxsize=858993459;
set mapreduce.input.fileinputformat.split.minsize=858993459;

 

When I query the second table, the job launches 51 mappers and 211 reducers and takes up all of the YARN resources.

 

I want to restrict the number of mappers and reducers for the Hive query.

 

 

Please help me to solve it.

 

Re: Hive limit number of mappers and reducers

@ganeshkumarj

 

a. mapred.map.tasks - The default number of map tasks per job is 2. Ignored when mapred.job.tracker is "local". You can change it with set mapred.map.tasks = <value>. Note that this value is only a hint to the input format; the actual number of mappers follows the number of input splits.


b. mapred.reduce.tasks - The default number of reduce tasks per job is 1. Typically set to 99% of the cluster's reduce capacity, so that if a node fails the reduces can still be executed in a single wave. Ignored when mapred.job.tracker is "local". You can change it with set mapred.reduce.tasks = <value>.

 

https://hadoop.apache.org/docs/r1.0.4/mapred-default.html
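
For a Hive-on-MapReduce job like the one in the question, the settings below are a minimal sketch of how the two counts are usually bounded. It assumes Hadoop 2.7 or later, a splittable file format, and Hive's CombineHiveInputFormat; every numeric value is illustrative and not taken from this thread. The mapper count follows the number of input splits, so it is steered through split sizes, while Hive estimates the reducer count from the input size unless that estimate is capped or overridden.

```sql
-- Mappers: combine input into larger splits.
-- Assumption: the job uses CombineHiveInputFormat (set explicitly here).
set hive.input.format=org.apache.hadoop.hive.ql.io.CombineHiveInputFormat;
-- Illustrative split size of 1 GiB; larger splits mean fewer mappers.
set mapreduce.input.fileinputformat.split.maxsize=1073741824;
set mapreduce.input.fileinputformat.split.minsize.per.node=1073741824;
set mapreduce.input.fileinputformat.split.minsize.per.rack=1073741824;

-- Reducers: cap Hive's automatic estimate (20 is an illustrative value).
set hive.exec.reducers.max=20;
-- Give each reducer more input so that fewer are requested (1 GiB here).
set hive.exec.reducers.bytes.per.reducer=1073741824;
-- Alternative: fix the reducer count outright instead of capping the estimate.
-- set mapreduce.job.reduces=20;

-- Concurrency: limit how many tasks of this job run at once (Hadoop 2.7+).
-- This does not change the task counts, only how many run in parallel.
set mapreduce.job.running.map.limit=10;
set mapreduce.job.running.reduce.limit=10;
```

The last two settings are the ones that most directly keep a single query from occupying the whole cluster, since the job still runs every task but only a bounded number of them at any moment.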

Re: Hive limit number of mappers and reducers

Alternatively, you could look into YARN queues and resource allocation.

 

This will not restrict the number of mappers or reducers, but it will control how many can run concurrently by giving the job access to only a subset of the available resources.
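
As a rough sketch of the queue approach, a Hive session can be pointed at a dedicated queue before the query runs. The queue name `limited` is hypothetical; it must already be defined in the cluster's scheduler configuration with the capacity you want the query to be confined to.

```sql
-- Submit this session's MapReduce jobs to a capacity-limited YARN queue.
-- 'limited' is a hypothetical queue name, not one from this thread.
set mapreduce.job.queuename=limited;
-- If Hive runs on Tez rather than MapReduce, the equivalent setting is:
-- set tez.queue.name=limited;
```

With this in place the query still plans the same number of mappers and reducers, but the scheduler only grants it containers up to the queue's share.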

 

 
