Support Questions

vikakumar · ‎09-17-2018

When running Informatica mappings in Tez mode on an HDP 2.6 cluster, I observed that the properties “mapreduce.job.maps” and “mapreduce.job.reduces” in the configuration of a Hive job run in Tez mode show the wrong values compared with the same job run in MapReduce mode.

For a large data set, the values in MR mode are:

mapreduce.job.maps: 3
mapreduce.job.reduces: 0

While in Tez mode they are:

mapreduce.job.maps: 2
mapreduce.job.reduces: 6

But the DAG Graphical view shows that there are 3 mappers.

There is a similar discrepancy in the value of the “mapreduce.job.reduces” property shown in the Tez UI.

We are unable to find an equivalent property in the Tez configurations that is correctly populated with the number of mappers and reducers.

ssubhas · ‎09-17-2018

@Vikash Kumar

The 'mapreduce.job.*' properties apply only to MR jobs. In Tez, the number of mappers is controlled by the parameters below:

tez.grouping.max-size (default 1073741824, which is 1 GB)
tez.grouping.min-size (default 52428800, which is 50 MB)
tez.grouping.split-count (not set by default)
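
To illustrate how these three parameters interact, here is a rough Python sketch of the size clamping that split grouping applies. It is a simplification, not the actual Tez code: the function name and the desired_splits argument are hypothetical (in Tez the desired count is derived from available cluster resources), and the real grouper also takes data locality and the original split boundaries into account.

```python
def estimate_grouped_splits(total_bytes, desired_splits,
                            min_size=52428800,      # tez.grouping.min-size default
                            max_size=1073741824,    # tez.grouping.max-size default
                            split_count=None):      # tez.grouping.split-count (unset by default)
    """Roughly estimate the number of grouped splits (map tasks).

    Simplified model: an explicit split count wins; otherwise the
    average group size is clamped between min_size and max_size.
    """
    if split_count is not None and split_count > 0:
        # An explicit tez.grouping.split-count overrides the size-based estimate.
        return split_count
    if total_bytes <= 0 or desired_splits <= 0:
        return 0

    bytes_per_group = total_bytes / desired_splits
    if bytes_per_group > max_size:
        # Groups would be too large: create more groups of about max_size each.
        return total_bytes // max_size + 1
    if bytes_per_group < min_size:
        # Groups would be too small: create fewer groups of about min_size each.
        return total_bytes // min_size + 1
    return desired_splits


if __name__ == "__main__":
    one_gib = 1073741824
    print(estimate_grouped_splits(3 * one_gib, desired_splits=2))
    print(estimate_grouped_splits(100 * 1024 * 1024, desired_splits=10))
```

Under this model, raising tez.grouping.min-size tends to reduce the number of mappers, and lowering tez.grouping.max-size tends to increase it.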

And, reducers are controlled in Hive with properties:

hive.exec.reducers.bytes.per.reducer (default 256000000)
hive.exec.reducers.max (default 1009)
hive.tez.auto.reducer.parallelism (default false)
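
As a rough illustration of how the first two properties bound the reducer count, the sketch below divides the input size by the bytes-per-reducer setting and caps the result at the maximum. The function name is hypothetical and the logic is a simplified approximation of Hive's estimate, not its exact code. When hive.tez.auto.reducer.parallelism is enabled, Tez may further adjust the count at runtime, which this sketch does not model.

```python
import math


def estimate_reducers(total_input_bytes,
                      bytes_per_reducer=256000000,  # hive.exec.reducers.bytes.per.reducer
                      max_reducers=1009):           # hive.exec.reducers.max
    """Approximate the compile-time reducer estimate for a reduce stage."""
    # Treat inputs smaller than one reducer's share as a single share.
    effective_bytes = max(total_input_bytes, bytes_per_reducer)
    reducers = math.ceil(effective_bytes / bytes_per_reducer)
    # Always at least one reducer, never more than the configured maximum.
    return min(max_reducers, max(1, reducers))


if __name__ == "__main__":
    for size in (10, 1_500_000_000, 10**12):
        print(size, estimate_reducers(size))
```

The input size here is the estimated data volume feeding the reduce stage, so the result can differ from what a static property such as mapreduce.job.reduces reports.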

For more details, refer to the link.
