Created on 11-06-2017 08:14 PM - edited 08-18-2019 12:39 AM
At the moment I have a Spark job (Java) that always needs to be running. It doesn't need many resources. However, whenever I run a Sqoop job (MapReduce), the job is stuck in the ACCEPTED state: waiting for AM container to be allocated, launched and register with RM.
I checked Ambari and the Spark config for scheduling is FAIR. For testing, I tried running two instances of the same Spark job and both ran without problems (state is RUNNING on both). There should be enough cores and memory left for the MapReduce job to run.
Spark Submit command:
/usr/hdp/current/spark-client/bin/spark-submit --class com.some.App --master yarn-cluster --deploy-mode cluster --num-executors 1 /path/to/file.jar "some.server:6667" "Some_App" "Some_App_Parser" "some.server" jdbc:jtds:sqlserver://some.server:1433/HL7_Metadata &; done
My Sqoop command (I added the memory limit, but it didn't help):
sqoop import -D --connect "jdbc:sqlserver://some.server\SQL2012;database=SomeDB;username=someUser;password=somePass" --e "SELECT SOMETHING" where \$CONDITIONS" --fields-terminated-by \002 --escaped-by \ --check-column Message_Audit_Log_Id --incremental append --last-value 1 --split-by Message_Audit_Log_Id --target-dir /target/path/
Here are some images for reference:
Created 11-07-2017 06:41 AM
Created on 11-07-2017 03:58 PM - edited 08-18-2019 12:38 AM
I do have a NodeManager; I will attach a screenshot.
Created 11-07-2017 04:11 PM
I meant adding another NodeManager in addition to this one. Add it on some other host and check if the job goes to the RUNNING state.
Created 11-07-2017 07:04 PM
Unfortunately I won't be able to add another NodeManager because we only have one host. Adding another host is not ideal for my situation.
Created 11-08-2017 03:29 AM
Assuming the value of "yarn.scheduler.capacity.maximum-am-resource-percent" is 0.2, can you try increasing it to 0.3 or 0.4 and check if it works?
Yarn -> Configs -> Advanced -> Scheduler -> Capacity Scheduler
Created 11-08-2017 01:44 PM
I'd like to thank you sir!!!!! Now both the spark job and the sqoop job can run at the same time. Can you explain to me what exactly this did?
Created 11-08-2017 02:04 PM
This value controls the maximum fraction of resources that can be used to run Application Masters, so it also controls the number of applications that can run concurrently. If it is too low, an Application Master may not even start, which leaves the app in the ACCEPTED state. If it is too high, the Application Masters can take up most of the resources, leaving only a few for the applications themselves.
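To make the arithmetic concrete, here is a rough sketch of how that limit could play out on a single node. It is a simplified model, not the scheduler's actual code: it looks at memory only and ignores vcores, queue capacities and per-user limits. The function name admit_ams and all the sizes (a 10 GB cluster, a 1 GB Spark AM, a 1.5 GB MapReduce AM) are made-up assumptions for illustration, not values taken from this cluster.

```python
def admit_ams(cluster_mb, max_am_percent, am_requests_mb):
    """Simplified model: which Application Masters get a container, in submission order."""
    am_limit_mb = cluster_mb * max_am_percent  # memory budget shared by all AMs
    used_mb = 0
    admitted = []
    for request_mb in am_requests_mb:
        # Assumption of this model: the first AM is always let through,
        # so a queue is never blocked completely by a tiny limit.
        fits = used_mb == 0 or used_mb + request_mb <= am_limit_mb
        if fits:
            used_mb += request_mb
        admitted.append(fits)
    return admitted


# Hypothetical numbers: 10 GB cluster, 1 GB Spark AM, then a 1.5 GB MapReduce AM.
for percent in (0.2, 0.4):
    print(percent, admit_ams(10240, percent, [1024, 1536]))
```

With these assumed sizes, the AM budget at 0.2 is 2048 MB, so the second AM (1024 + 1536 = 2560 MB in total) does not fit and that app would wait in ACCEPTED. At 0.4 the budget is 4096 MB and both AMs fit.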
Created 01-07-2020 01:34 AM
Does the issue happen for a particular queue? Could you please also let us know whether it happens for one particular job only?
It would help if you could share the application logs, RM logs and fair-scheduler.xml for further analysis.