Incorrect value of number of mappers and reducers in Tez mode

When running Informatica mappings in Tez mode on an HDP 2.6 cluster, I observed that the properties “mapreduce.job.maps” and “mapreduce.job.reduces” in the configuration of a Hive job run in Tez mode show incorrect values compared to the same job run in MapReduce mode.

For a large data set, the values in MR mode are:

mapreduce.job.maps: 3
mapreduce.job.reduces: 0

While in Tez mode they are:

mapreduce.job.maps: 2
mapreduce.job.reduces: 6

But the DAG Graphical view shows that there are 3 mappers.

92466-dag.png

92465-tez-ui.png

There is a discrepancy in the value of the “mapreduce.job.reduces” property in the Tez UI as well.

92467-tez-ui.png

We are unable to find an equivalent property in the Tez configurations that is correctly populated with the number of mappers and reducers.

1 REPLY

@Vikash Kumar

The 'mapreduce.job.*' properties apply only to MR jobs. In Tez, the number of mappers is controlled by the following parameters:

  • tez.grouping.max-size (default 1073741824, which is 1 GB)
  • tez.grouping.min-size (default 52428800, which is 50 MB)
  • tez.grouping.split-count (not set by default)
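
To illustrate how the two size limits bound the number of mapper tasks, here is a minimal Python sketch. It is a simplification, not Tez's actual grouping code: the function name estimate_grouped_splits and the desired_splits argument (standing in for tez.grouping.split-count or the count Tez derives from available cluster capacity) are hypothetical, and real Tez also takes data locality and the original split boundaries into account.

```python
def estimate_grouped_splits(total_bytes, desired_splits,
                            min_size=52428800, max_size=1073741824):
    """Rough estimate of the number of grouped splits (mapper tasks).

    Assumption: the desired count is adjusted only when the resulting
    bytes per group would fall outside [min_size, max_size].
    """
    if total_bytes <= 0 or desired_splits <= 0:
        return 0
    bytes_per_group = total_bytes // desired_splits
    if bytes_per_group > max_size:
        # Groups would be too large: use more groups of at most max_size.
        return total_bytes // max_size + 1
    if bytes_per_group < min_size:
        # Groups would be too small: use fewer groups of at least min_size.
        return total_bytes // min_size + 1
    return desired_splits


if __name__ == "__main__":
    # 10 GiB of input with only 4 desired splits exceeds the 1 GB cap per group.
    print(estimate_grouped_splits(10 * 1024**3, 4))
    # 100 MiB of input with 10 desired splits falls below the 50 MB floor.
    print(estimate_grouped_splits(100 * 1024**2, 10))
```

Under these assumptions, lowering tez.grouping.max-size tends to increase the mapper count for large inputs, while raising tez.grouping.min-size tends to reduce it for small ones.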

The number of reducers is controlled in Hive by these properties:

  • hive.exec.reducers.bytes.per.reducer (default 256000000)
  • hive.exec.reducers.max (default 1009)
  • hive.tez.auto.reducer.parallelism (default false)
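
As a rough illustration of how the first two properties interact, the Python sketch below divides the estimated input size by the bytes-per-reducer setting and caps the result at the maximum. The function name estimate_reducers is hypothetical, and this is a simplified model rather than Hive's exact logic: when hive.tez.auto.reducer.parallelism is enabled, Tez may further adjust the reducer count at runtime.

```python
def estimate_reducers(input_bytes, bytes_per_reducer=256000000, max_reducers=1009):
    """Simplified estimate of the reducer count for a given input size.

    Assumption: reducers = ceil(input_bytes / bytes_per_reducer),
    clamped to the range [1, max_reducers].
    """
    if input_bytes <= 0:
        return 1
    # Integer ceiling division avoids floating-point rounding on large sizes.
    wanted = -(-input_bytes // bytes_per_reducer)
    return max(1, min(max_reducers, wanted))


if __name__ == "__main__":
    print(estimate_reducers(1_500_000_000))  # about 1.5 GB of input
    print(estimate_reducers(10**12))         # about 1 TB, hits the cap
```

Because the estimate depends on input size rather than on 'mapreduce.job.reduces', the value shown for that property in the job configuration need not match the number of reducer tasks in the DAG.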

For more details, refer link.