Incorrect value of number of mappers and reducers in Tez mode


When running Informatica mappings on an HDP 2.6 cluster in Tez mode, I observed that the properties “mapreduce.job.maps” and “mapreduce.job.reduces” in the configuration of a Hive job show different values than the same job run in MapReduce mode, and the Tez values appear to be wrong.

For a large data set, the values in MR mode are:

mapreduce.job.maps: 3
mapreduce.job.reduces: 0

In Tez mode they are:

mapreduce.job.maps: 2
mapreduce.job.reduces: 6

However, the DAG graphical view shows 3 mappers.

92466-dag.png

92465-tez-ui.png

The value of the “mapreduce.job.reduces” property shown in the Tez UI is inconsistent as well.

92467-tez-ui.png

We are unable to find an equivalent property in the Tez configurations that is correctly populated with the number of mappers and reducers.


Re: Incorrect value of number of mappers and reducers in Tez mode

@Vikash Kumar

The 'mapreduce.job.*' properties apply only to MR jobs. In Tez, the number of mappers is controlled by the following parameters:

  • tez.grouping.max-size (default 1073741824, which is 1 GB)
  • tez.grouping.min-size (default 52428800, which is 50 MB)
  • tez.grouping.split-count (not set by default)
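
To make the effect of the two size limits concrete, here is a rough sketch of how a split count can be clamped so that the average group size stays between tez.grouping.min-size and tez.grouping.max-size. This is a simplification, not the actual Tez grouping code: the function name and the `desired_splits` input are hypothetical, and the real grouper also takes data locality and available cluster capacity into account.

```python
# Simplified illustration only; not the Tez implementation.
# `desired_splits` is a hypothetical input standing in for whatever
# task count the engine would otherwise aim for.
TEZ_GROUPING_MIN_SIZE = 52428800      # 50 MB default
TEZ_GROUPING_MAX_SIZE = 1073741824    # 1 GB default


def estimate_grouped_splits(total_bytes, desired_splits,
                            min_size=TEZ_GROUPING_MIN_SIZE,
                            max_size=TEZ_GROUPING_MAX_SIZE):
    """Adjust a desired split count so the average group size is within bounds."""
    if total_bytes <= 0 or desired_splits <= 0:
        return 0
    bytes_per_group = total_bytes / desired_splits
    if bytes_per_group > max_size:
        # Groups would be too large: use more, smaller groups.
        return total_bytes // max_size + 1
    if bytes_per_group < min_size:
        # Groups would be too small: use fewer, larger groups.
        return total_bytes // min_size + 1
    return desired_splits


if __name__ == "__main__":
    ten_gib = 10 * 1024 ** 3
    print(estimate_grouped_splits(ten_gib, desired_splits=4))
```

Under these assumptions, lowering the maximum size raises the number of mappers for a large input, while raising the minimum size lowers it for a small input.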

The number of reducers is controlled in Hive by these properties:

  • hive.exec.reducers.bytes.per.reducer (default 256000000)
  • hive.exec.reducers.max (default 1009)
  • hive.tez.auto.reducer.parallelism (default false)
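
As a rough illustration of how the first two properties interact, the sketch below divides the input size by the bytes-per-reducer value, rounds up, and caps the result at the maximum. The function name is hypothetical and this is only an approximation of Hive's estimate: it does not model the runtime adjustment that Tez may apply when hive.tez.auto.reducer.parallelism is enabled.

```python
# Approximate illustration only; not the Hive implementation.
BYTES_PER_REDUCER = 256000000   # hive.exec.reducers.bytes.per.reducer default
MAX_REDUCERS = 1009             # hive.exec.reducers.max default


def estimate_reducers(total_input_bytes,
                      bytes_per_reducer=BYTES_PER_REDUCER,
                      max_reducers=MAX_REDUCERS):
    """Ceiling of input size over bytes per reducer, kept within [1, max_reducers]."""
    reducers = (total_input_bytes + bytes_per_reducer - 1) // bytes_per_reducer
    return max(1, min(max_reducers, reducers))


if __name__ == "__main__":
    # About 1 GB of input with the default settings.
    print(estimate_reducers(1_000_000_000))
```

With these defaults, lowering hive.exec.reducers.bytes.per.reducer increases the estimated reducer count, up to the hive.exec.reducers.max cap.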

For more details, refer to the link.
