
What is the Oozie Heap size recommendation for production?


If a cluster needs to run 100 Oozie workflows concurrently, is there a formula to estimate oozie_heapsize?

Or is there an internal/external best-practice document that covers heap sizing?

1 ACCEPTED SOLUTION

Master Mentor

@hosako@hortonworks.com

I found this very helpful

The Oozie launcher is just another MapReduce job, so any configuration you can set for a MapReduce job is valid for the launcher. The most relevant and useful settings are usually memory and queue (mapreduce.map.memory.mb and mapreduce.job.queuename). To set these for the launcher in an Oozie workflow action, prefix the setting with "oozie.launcher". For example, oozie.launcher.mapreduce.map.memory.mb controls the memory for the launcher mapper itself, whereas plain mapreduce.map.memory.mb only influences the memory setting for the underlying MapReduce job that the Hadoop, Hive, or Pig action runs. So if you have a Hive query that requires you to increase the client-side heap when you submit it through the Hive CLI, remember to increase the launcher mapper's memory when you define the Oozie action for it.
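As a rough sketch, the prefixed launcher properties described above would sit alongside the plain ones in the action's configuration block. The action name, memory values, queue name, and script name below are illustrative assumptions, not values from this thread:

```xml
<!-- Sketch only: action name, values, and script are placeholder assumptions -->
<action name="run-hive">
  <hive xmlns="uri:oozie:hive-action:0.5">
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
      <!-- Memory for the launcher mapper itself (note the oozie.launcher prefix) -->
      <property>
        <name>oozie.launcher.mapreduce.map.memory.mb</name>
        <value>4096</value>
      </property>
      <!-- Queue for the launcher job -->
      <property>
        <name>oozie.launcher.mapreduce.job.queuename</name>
        <value>launcher</value>
      </property>
      <!-- Memory for the underlying MapReduce job's mappers (no prefix) -->
      <property>
        <name>mapreduce.map.memory.mb</name>
        <value>2048</value>
      </property>
    </configuration>
    <script>query.q</script>
  </hive>
</action>
```

The same prefixing pattern applies to any MapReduce property you want to direct at the launcher rather than the underlying job.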


3 REPLIES



Thanks! Does this mean Oozie's Tomcat heap size is not important for running 100 concurrent workflows?

Master Mentor

If I were in your shoes, I would focus on this thread.