Setting Mapper memory for pig in tez mode

Expert Contributor

I am running my Pig scripts and Hive queries in Tez mode. For all of these scripts and queries, the mapper memory requested was more than the memory actually used, so I lowered mapreduce.map.memory.mb and also changed mapreduce.map.java.opts.

Even after changing these values, the mapper memory requested is still more than the mapper memory used, and nothing seems to have changed in the performance metrics (this is from analyzing the job in Dr. Elephant). On top of that, the Pig script now aborts with the error message below after changing these settings:

"java.lang.IllegalArgumentException: tez.runtime.io.sort.mb 1638 should be larger than 0 and should be less than the available task memory (MB):786"
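Read literally, the message describes a simple bounds check: the sort buffer must be greater than 0 and smaller than the memory available to the task. The sketch below only mirrors the wording of the message; it is not Tez's actual code, and the function name `check_sort_buffer` is made up for illustration.

```python
def check_sort_buffer(sort_mb, available_task_memory_mb):
    """Mirror the rule stated in the error message (illustrative, not Tez source).

    The sort buffer (tez.runtime.io.sort.mb) must be larger than 0 and
    smaller than the memory available to the task, both in MB.
    """
    if not (0 < sort_mb < available_task_memory_mb):
        raise ValueError(
            f"tez.runtime.io.sort.mb {sort_mb} should be larger than 0 and "
            f"should be less than the available task memory (MB):"
            f"{available_task_memory_mb}"
        )
    return sort_mb


# The two numbers from the error above: a 1638 MB sort buffer cannot fit
# into 786 MB of available task memory, so the check fails.
try:
    check_sort_buffer(1638, 786)
except ValueError as err:
    print(err)
```

In other words, lowering the task memory without also lowering tez.runtime.io.sort.mb (or raising the task memory above it) would trip this check.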

I never gave 786 MB anywhere in my setting, where did it take this value from?

Also, how do I configure the map and reduce memory in Tez execution mode? (I see documentation for Hive that says to set hive.tez.container.size, but nothing for Pig.)

And is it possible to configure the map and reduce memory separately in Tez mode? The Hive on Tez documentation only mentions the map memory setting and says nothing about reducer memory. Also, since Tez creates a DAG of tasks, these are not like MapReduce, right? Are map and reduce each just seen as an individual task in the DAG, or can these DAG tasks still be classified as mapper/reducer actions?

Thanks!

1 ACCEPTED SOLUTION

Guru

@R M

What is the value of the following properties:

tez.am.resource.memory.mb

tez.task.resource.memory.mb

Have you tried adjusting these, since you are running in Tez mode?

3 REPLIES 3

Expert Contributor

Thanks for the suggestion. I have not tried these parameters. What are they for? Are these the ones that set the mapper memory size in Pig?

Guru

Since you are running Pig while hive.execution.engine is set to tez, you can tune these parameters or set the upper limit in hive-env. Either way, you should be able to control how much memory is allocated for your job.

This community article explains the ideal values in detail:

https://community.hortonworks.com/articles/14309/demystify-tez-tuning-step-by-step.html

In short:

  • Set tez.am.resource.memory.mb equal to yarn.scheduler.minimum-allocation-mb

Try that and see if that helps.
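As a rough illustration of how these settings could be written into a Pig script, here is a small sketch that renders Pig `set` statements for the two Tez properties mentioned in this thread. It assumes that Pig's `set` statement forwards the property to the Tez job configuration. The helper name `pig_set_statements` and the values 1024 and 2048 are placeholders for illustration, not recommendations; substitute your cluster's yarn.scheduler.minimum-allocation-mb and a task size that fits your workload.

```python
def pig_set_statements(yarn_min_allocation_mb, task_memory_mb):
    """Render Pig `set` lines for the Tez memory properties named in this thread.

    Follows the advice above: the Tez AM memory is set equal to
    yarn.scheduler.minimum-allocation-mb. The task memory is left to the caller.
    """
    props = {
        "tez.am.resource.memory.mb": yarn_min_allocation_mb,
        "tez.task.resource.memory.mb": task_memory_mb,
    }
    return [f"set {name} {value};" for name, value in props.items()]


# Placeholder values: 1024 MB minimum YARN allocation, 2048 MB per Tez task.
for line in pig_set_statements(yarn_min_allocation_mb=1024, task_memory_mb=2048):
    print(line)
```

The printed lines would go at the top of the Pig script. Keep tez.runtime.io.sort.mb below the memory available to each task, otherwise the IllegalArgumentException quoted in the question will come back.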