Created on 11-19-2018 02:24 AM - edited 09-16-2022 06:54 AM
Hello everyone!
I have a typical scenario where multiple pipelines run on Oozie, each with different dependencies and schedules. These pipelines comprise different kinds of jobs, such as Hive, Spark, and Java. Many of these jobs are memory-heavy. The cluster has a total of 840 GB of RAM, which is enough to complete any one of these jobs but may not be enough for several of them to run and complete at the same time.
Sometimes a few of these jobs need to run concurrently, and in that case I've noticed a sort of starvation in YARN: none of the jobs makes progress, the logs fill with heartbeats, and none of them ever completes.
YARN is set to use the Fair Scheduler. I would expect that in a situation like this it would give resources to at least one of the jobs, but it seems that all the jobs are fighting for resources and YARN is not able to resolve the impasse.
I would like to know the best practices for handling this type of scenario. Do I need to define different YARN queues with different resources/priorities (currently all the jobs run on the default queue)?
Created 11-19-2018 02:46 AM
Created 11-20-2018 02:19 AM
Hi @Harsh J, thank you very much for this information (I am using Oozie server build version: 4.1.0-cdh5.13.2)!
So, if I understand correctly, I need to add two properties to the Oozie action configuration: one specifying the launcher queue and one specifying the job queue.
Below is a Sqoop action to which I have added these two properties (the first two in the configuration block):
<action name="DLT01V_VPAXINF_IMPORT_ACTION">
    <sqoop xmlns="uri:oozie:sqoop-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <property>
                <name>oozie.launcher.mapred.job.queue.name</name>
                <value>oozie_launcher_queue</value>
            </property>
            <property>
                <name>mapred.job.queue.name</name>
                <value>job_queue</value>
            </property>
            <property>
                <name>oozie.launcher.mapreduce.map.java.opts</name>
                <value>-Xmx4915m</value>
            </property>
            <property>
                <name>oozie.launcher.mapreduce.reduce.java.opts</name>
                <value>-Xmx9830m</value>
            </property>
            <property>
                <name>oozie.launcher.yarn.app.mapreduce.am.command-opts</name>
                <value>-Xmx4915m</value>
            </property>
        </configuration>
        [...]
    </sqoop>
    [...]
</action>
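For the two queue names in this action to resolve, I assume the Fair Scheduler allocation file would need matching queue definitions. Below is a minimal sketch of what that could look like. The queue names are taken from the action, while the weights and the `maxResources` cap are illustrative values I made up, not tested settings. The idea, as far as I understand it, is to cap the launcher queue so that launchers alone cannot occupy all of the cluster memory.

```xml
<?xml version="1.0"?>
<!-- Hypothetical fair-scheduler.xml: queue names match the Oozie action,
     all weights and limits are illustrative assumptions. -->
<allocations>
  <!-- Small, capped queue for the Oozie launcher jobs -->
  <queue name="oozie_launcher_queue">
    <weight>1.0</weight>
    <maxResources>102400 mb, 20 vcores</maxResources>
  </queue>
  <!-- Larger share for the jobs started by the launchers -->
  <queue name="job_queue">
    <weight>4.0</weight>
  </queue>
</allocations>
```

On a Cloudera Manager cluster these queues may be managed through the resource pool configuration instead of a hand-edited file.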
I have some questions:
Thank you for the support!
Created 11-25-2018 06:34 PM
Created 11-26-2018 01:42 AM
Thank you very much @Harsh J!
If I understood correctly, these parameters
control the maximum amount of memory allocated for the Oozie launcher.
What are the equivalent parameters to control the memory allocated for the action instead (e.g. a Sqoop action), as shown in the image?
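One possibility, not confirmed in this thread: since the `oozie.launcher.` prefix targets the launcher, the same MapReduce properties without that prefix, placed in the action's `<configuration>` block, might apply to the job that the action starts. A sketch under that assumption, with illustrative values:

```xml
<!-- Assumption: unprefixed properties apply to the job started by the action.
     The values are hypothetical; only the property names are standard
     MapReduce settings. -->
<property>
  <name>mapreduce.map.memory.mb</name>
  <value>4096</value>
</property>
<property>
  <name>mapreduce.map.java.opts</name>
  <value>-Xmx3276m</value>
</property>
```

Here the heap size is kept below the container size to leave room for non-heap memory.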