Created on 12-30-2019 08:54 PM - last edited on 12-31-2019 01:09 AM by VidyaSargur
Hi,
I have set up the YARN Fair Scheduler in Ambari (HDP 3.1.0.0-78) for the "default" queue itself. So far, I haven't added any new queues.
Now, when I submit a simple job against this queue, the application stays in the "ACCEPTED" state forever, and I get the below message in the YARN logs:
Additional information is given below. Please help me fix this issue at the earliest.
For the "default" queue, the below parameters are set through "fair-scheduler.xml".
Also, no jobs are currently running apart from the one that I have launched.
Given below is a screenshot, which confirms that the maximum AM resource percent is greater than 0.1.
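For context, the allocation file I am describing contains just the single default queue, roughly along these lines (the values here are placeholders, not my exact settings):
<?xml version="1.0"?>
<allocations>
  <queue name="default">
    <weight>1.0</weight>
    <schedulingPolicy>fair</schedulingPolicy>
    <!-- placeholder: in the Fair Scheduler the per-queue AM limit is maxAMShare, which I understand defaults to 0.5 -->
    <maxAMShare>0.5</maxAMShare>
  </queue>
</allocations>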
Created 01-02-2020 09:20 PM
Hi @EricL,
Thanks for your inputs.
The value of yarn.app.mapreduce.am.resource.mb is set to 1024 in the "mapred-site.xml" file.
I was not able to find a value for "yarn.app.mapreduce.am.resource.cpu-vcores" in any of the XML files (i.e., core-site.xml, mapred-site.xml, yarn-site.xml, capacity-scheduler.xml, etc.).
<property>
  <name>yarn.app.mapreduce.am.resource.mb</name>
  <value>1024</value>
</property>
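As far as I can tell, when "yarn.app.mapreduce.am.resource.cpu-vcores" is not defined anywhere, the shipped default of 1 vcore applies (and this pair of settings governs the MapReduce AM specifically). Setting it explicitly in mapred-site.xml would look like the property above, for example:
<property>
  <!-- default is 1 when unset; shown here only for illustration -->
  <name>yarn.app.mapreduce.am.resource.cpu-vcores</name>
  <value>1</value>
</property>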
Here is the progress that I have made so far:
- After setting up the YARN Fair Scheduler, I also configured the Spark program to use a fair scheduling pool, based on the default Spark fair scheduler XML template (see the sketch just after this list).
- The minimum and maximum allocation (in MB) for the Fair Scheduler in YARN are set to 1024 MB and 3072 MB respectively.
- After running a single Spark job (with both driver and executor memory set to 512 MB), I was able to verify that the job runs. However, it was consuming the entire 3 GB of memory.
- As a result, the next Spark job does not run at all, as it keeps waiting for memory.
- However, if I revert the YARN scheduling to the Capacity Scheduler, then with the same memory settings both jobs run fine without any issues.
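For reference, the Spark-side pool configuration I mentioned follows the default fairscheduler.xml template shipped with Spark, roughly like this (a sketch, not my exact file); my understanding is that this only schedules jobs within a single Spark application and does not control how YARN shares memory between two separate applications:
<?xml version="1.0"?>
<allocations>
  <pool name="default">
    <schedulingMode>FAIR</schedulingMode>
    <weight>1</weight>
    <minShare>0</minShare>
  </pool>
</allocations>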
So, what additional memory-related parameters need to be set for Fair Scheduling so that the jobs run properly?
Please help me fix this issue.
Created 01-05-2020 09:26 AM
Hi @EricL ,
This is just a gentle reminder.
Can you please help me fix this issue?
Thanks and Regards,
Sudhindra
Created 01-07-2020 11:05 PM
Hi @EricL ,
I am still facing the same issue when I use the YARN Fair Scheduler to run the Spark jobs.
With the same memory configuration, the Spark jobs run fine when the YARN Capacity Scheduler is used.
Can you please help me fix this issue?
Thanks and Regards,
Sudhindra
Created 01-09-2020 08:44 PM
Hi @EricL ,
I changed the parameter "yarn.app.mapreduce.am.resource.mb" to 2 GB (2048 MB).
Although the second Spark job now starts under the "Fair Scheduler" configuration, its tasks are not getting the required resources at all:
[Stage 0:> (0 + 0) / 1]20/01/09 22:58:01 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
20/01/09 22:58:16 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
20/01/09 22:58:31 WARN YarnScheduler: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Here is the relevant information about the cluster:
1. Number of nodes in the Cluster: 2
2. Total amount of memory of Cluster: 15.28 GB
(yarn.nodemanager.resource.memory-mb = 7821 MB
yarn.app.mapreduce.am.resource.mb = 2048 MB
yarn.scheduler.minimum-allocation-mb = 1024 MB
yarn.scheduler.maximum-allocation-mb = 3072 MB)
3. Number of executors set through the program: 5 (spark.num.executors)
4. Number of cores set through the program: 3 (spark.executor.cores)
5. Spark Driver Memory and Spark Executor Memory: 2g each
Please help me understand what else is going wrong.
Note: With the same set of parameters (along with yarn.app.mapreduce.am.resource.mb of 1024 MB), the Spark jobs run fine when the YARN Capacity Scheduler is used. However, they do not run when the YARN Fair Scheduler is used. So, I want to understand what is going wrong only with the Fair Scheduler.
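Here is my rough math on the memory request, assuming the default executor memory overhead of max(384 MB, 10%) and container requests being rounded up to the 1024 MB minimum allocation (please correct me if these assumptions are wrong):
- Each 2 GB executor asks for 2048 + 384 = 2432 MB, which YARN rounds up to a 3072 MB container.
- 5 executors therefore request 5 x 3072 = 15360 MB, plus the AM/driver container on top of that, and 5 x 3 = 15 executor vcores.
- Each NodeManager offers 7821 MB, so a node can hold at most two 3072 MB containers; across the 2 nodes that is at most 4 such executors plus a smaller AM container.
- So even in the best case, only about 4 of the 5 requested executors could ever be placed on this cluster, regardless of the scheduler.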
Created 01-16-2020 10:26 PM
Hi @lyubomirangelo and @EricL ,
Sorry for the delayed response. Thanks for your inputs.
I have already changed the number of vcores, but I am still facing the same issue.
In the meantime, I was able to execute the jobs with the YARN Capacity Scheduler (with the same memory configuration). So, I am not sure what is wrong with the settings of the YARN Fair Scheduler.
Please suggest whether any specific settings are required for the YARN Fair Scheduler.
Also, I am still using the default queue; I haven't set up a separate queue for the Fair Scheduler.
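If it helps, my understanding is that a dedicated queue would be declared in the allocation file roughly like this (a sketch only, with hypothetical names and placeholder values; I have not applied this), and the jobs would then be submitted with something like spark.yarn.queue=spark:
<?xml version="1.0"?>
<allocations>
  <queue name="default">
    <weight>1.0</weight>
  </queue>
  <!-- hypothetical queue for the Spark jobs -->
  <queue name="spark">
    <weight>2.0</weight>
    <maxRunningApps>5</maxRunningApps>
  </queue>
</allocations>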
Thanks and Regards,
Sudhindra
Created on 01-17-2020 01:59 AM - edited 01-17-2020 02:05 AM
Hi Sudhindra,
Thank you for the update.
Can you share the SparkConf you use for your applications?
The following settings should work for small-resource apps (note that dynamic allocation is disabled):
from pyspark import SparkConf

# Minimal configuration for a small app; dynamic allocation is turned off
conf = (SparkConf()
        .setAppName("simple")
        .set("spark.shuffle.service.enabled", "false")
        .set("spark.dynamicAllocation.enabled", "false")
        .set("spark.cores.max", "1")
        .set("spark.executor.instances", "2")
        .set("spark.executor.memory", "200m")
        .set("spark.executor.cores", "1"))
From:
PS: Please share the number of cores available on your nodes; spark.executor.cores should not be higher than the number of cores available on each node. Also, are you running Spark in cluster or client mode?
HTH
Best,
Lyubomir