Hadoop - Sqoop job stuck on ACCEPTED when there is a spark job RUNNING


At the moment I have a Spark job (Java) that always needs to be running. It doesn't need many resources. However, whenever I run a Sqoop job (MapReduce), the job is stuck as ACCEPTED: "waiting for AM container to be allocated, launched and register with RM".

I checked Ambari and the Spark scheduling config is set to FAIR. For testing, I tried running two of the same Spark job at once and both ran with no problems (state is RUNNING on both). There should be enough cores and memory left for the MapReduce job to run.
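
(For reference, this is roughly how I have been checking the stuck application from the command line; the application id below is just a placeholder.)

yarn application -list -appStates ACCEPTED
yarn application -status application_1500000000000_0001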

Spark Submit command:

/usr/hdp/current/spark-client/bin/spark-submit \
  --class com.some.App \
  --master yarn-cluster \
  --deploy-mode cluster \
  --num-executors 1 \
  /path/to/file.jar "some.server:6667" "Some_App" "Some_App_Parser" "some.server" \
  "jdbc:jtds:sqlserver://some.server:1433/HL7_Metadata"

My Sqoop command (I added the memory limit, but it didn't help):

sqoop import -D mapreduce.map.memory.mb=2048 \
  --connect "jdbc:sqlserver://some.server\SQL2012;database=SomeDB;username=someUser;password=somePass" \
  --query "SELECT SOMETHING WHERE \$CONDITIONS" \
  --fields-terminated-by \002 \
  --escaped-by \\ \
  --check-column Message_Audit_Log_Id \
  --incremental append \
  --last-value 1 \
  --split-by Message_Audit_Log_Id \
  --target-dir /target/path/

Here are some images for reference:

43445-spark-1.png

43446-spark-2.png

43447-sqoop.png

43448-yarn-ui.png

1 ACCEPTED SOLUTION

Super Guru

@Kevin Nguyen,

Assuming the value of "yarn.scheduler.capacity.maximum-am-resource-percent" is currently 0.2, can you try increasing it to 0.3 or 0.4 and check whether that helps?

Yarn -> Configs -> Advanced -> Scheduler -> Capacity Scheduler
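
In the Capacity Scheduler text area there, the setting is typically shown as a property=value line; a minimal sketch of the change (0.4 is just the suggested value from above):

yarn.scheduler.capacity.maximum-am-resource-percent=0.4

After saving, Ambari should prompt you to restart the affected YARN services so the new value takes effect.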

Thanks,

Aditya


8 REPLIES

Super Guru

@Kevin Nguyen,

Can you try adding a NodeManager and see if the job moves to the RUNNING state?

Thanks,

Aditya


I do have a NodeManager; I will attach a screenshot.

43459-yarn-ui-2.png

Super Guru

@Kevin Nguyen,

I meant adding another NodeManager in addition to this one. Add it on some other host and check if the job goes to the RUNNING state.


Unfortunately I won't be able to add another NodeManager because we only have one host. Adding another host is not ideal for my situation.

Super Guru

@Kevin Nguyen,

Assuming the value of "yarn.scheduler.capacity.maximum-am-resource-percent" is currently 0.2, can you try increasing it to 0.3 or 0.4 and check whether that helps?

Yarn -> Configs -> Advanced -> Scheduler -> Capacity Scheduler

Thanks,

Aditya


Thank you so much! Now both the Spark job and the Sqoop job can run at the same time. Can you explain what exactly this setting did?

Super Guru

@Kevin Nguyen,

This value controls the maximum fraction of the cluster's resources that can be used to run Application Masters, which in turn limits the number of concurrently running applications. If it is too low, a new Application Master may not be able to start, which leaves the application stuck in the ACCEPTED state. If it is too high, Application Masters can consume most of the resources, leaving the actual tasks with very little.
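
As a rough illustration (the numbers below are assumed, not taken from this cluster): if the single NodeManager gives YARN 16 GB and maximum-am-resource-percent is 0.2, only about 16 GB x 0.2 = 3.2 GB can go to AM containers. If the always-running Spark job's AM container is already holding, say, 2 GB, a MapReduce AM asking for roughly 1.5 GB cannot be allocated, so the Sqoop job waits in ACCEPTED. At 0.4 the AM budget is about 6.4 GB, which is enough for both.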

Cloudera Employee

Hi,

Does the issue happen for a particular queue? Could you please let us know, and does it happen only for a particular job?

It would also help if you could share the application logs, the ResourceManager logs, and fair-scheduler.xml for further analysis.
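
If it helps, this is a sketch of how the application logs are usually collected (the application id is a placeholder, and the ResourceManager log path can vary by install; on HDP it is typically under /var/log/hadoop-yarn/yarn/):

yarn logs -applicationId application_1500000000000_0001 > sqoop_app_logs.txt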

 

Thanks

AKR