Try setting that value to 10, for example, and retry.
Do not modify the hardware specs; just change the number of vCPUs allocated to YARN.
You can, to some extent, "overcommit" the vCPUs allocated to YARN. For a testing cluster you should be fine.
Also, be sure you have allocated enough RAM to YARN.
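For reference, a sketch of the relevant yarn-site.xml entry, assuming the value in question is the NodeManager's vCPU count (yarn.nodemanager.resource.cpu-vcores); adjust to your cluster:

```xml
<!-- yarn-site.xml: number of vCPUs YARN may allocate on this node.
     This can exceed the physical core count to overcommit (fine for test clusters). -->
<property>
  <name>yarn.nodemanager.resource.cpu-vcores</name>
  <value>10</value>
</property>
```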
I tried the solution below and it works perfectly for me.
1) Change the Hadoop scheduler from the Capacity Scheduler to the Fair Scheduler. On a small cluster, the Capacity Scheduler assigns a fixed amount of memory (e.g. 2048 MB) per queue to complete a single MapReduce job, so if more than one MapReduce job runs in a single queue, they deadlock.
Solution: add the following properties to yarn-site.xml:
<property>
  <name>yarn.resourcemanager.scheduler.class</name>
  <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
  <name>yarn.scheduler.fair.allocation.file</name>
  <value>file:/%HADOOP_HOME%/etc/hadoop/fair-scheduler.xml</value>
</property>
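The allocation file referenced above has to exist; here is a minimal fair-scheduler.xml sketch (the queue name and limits are illustrative, not from the original setup):

```xml
<?xml version="1.0"?>
<!-- fair-scheduler.xml: minimal Fair Scheduler allocation file (illustrative values) -->
<allocations>
  <queue name="default">
    <minResources>2048 mb,1 vcores</minResources>
    <maxResources>20480 mb,8 vcores</maxResources>
    <maxRunningApps>10</maxRunningApps>
    <schedulingPolicy>fair</schedulingPolicy>
  </queue>
</allocations>
```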
2) By default, YARN's total memory allocation is 8 GB, so if we run two MapReduce programs the memory used by Hadoop exceeds 8 GB and the jobs deadlock.
Solution: increase the NodeManager's total memory using the following properties in yarn-site.xml:
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>20960</value>
</property>
<property>
  <name>yarn.scheduler.minimum-allocation-mb</name>
  <value>1024</value>
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>2048</value>
</property>
So if a user wants to run more than two MapReduce programs, they need to add NodeManagers or increase YARN's total memory. (Note: increasing it reduces the memory left for the rest of the system. With the properties above, 20960 MB at the 1024 MB minimum allocation gives about 20 containers, enough to run roughly 10 two-container MapReduce programs concurrently.)
Your suggestion worked for me as well. I'm still wondering how a simple job that was expected to receive fewer than 10 records got stuck.
Oozie uses an MR job to execute the actual action logic:
In your case, when you execute the Sqoop action on the command line, Sqoop runs an MR job that uses two containers to complete the job (one for the ApplicationMaster and one for the Mapper).
When you execute the same thing from Oozie, it uses four containers:
one AM and one Mapper for the Oozie Launcher, and Sqoop launches another AM and Mapper to do the actual job.
You can reduce the AM and Mapper memory for the Launchers, as they don't need much memory (except if you use the Spark action in yarn-client mode, which runs Spark in the Launcher Mapper).
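To shrink the Launcher, Oozie lets you override the Launcher job's configuration from the action's configuration block via the oozie.launcher. prefix; a sketch with assumed (not prescribed) memory values:

```xml
<!-- In the workflow action's <configuration>: reduce the Launcher's AM and Mapper memory.
     The oozie.launcher. prefix applies a property to the Launcher job only,
     not to the MR job the action itself starts. Values are illustrative. -->
<property>
  <name>oozie.launcher.yarn.app.mapreduce.am.resource.mb</name>
  <value>512</value>
</property>
<property>
  <name>oozie.launcher.mapreduce.map.memory.mb</name>
  <value>512</value>
</property>
```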
All the best,