I am using Sqoop to import data from Netezza. Everything works, but the job uses queue 'A' (the one specified in the sqoop import command) to pull data from Netezza, and then uses the "default" queue while loading the data into Hive. It is not a huge issue, but I am running into the yarn.scheduler.capacity.maximum-am-resource-percent limit because the queue is overloaded with other jobs. There seem to be two solutions: increase the maximum AM resource percent for the default queue, or change the Hive load step of the Sqoop job to use queue 'A'.
How can I change it?
Sqoop uses MapReduce as its execution engine to read the data from Netezza, and the load into Hive is usually done with a "load data inpath" statement. To specify the queue for the Sqoop job, try passing -Dmapreduce.job.queuename=<queue_name>.
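As a rough sketch of where that option goes: generic Hadoop options such as -D have to come directly after the tool name (import), before the Sqoop-specific options. The host, database, user, table names, and password file path below are hypothetical placeholders, and the script only prints the command so it can be reviewed before running.

```shell
# Hypothetical values: replace with your own queue, host, database, and table.
QUEUE="A"
JDBC_URL="jdbc:netezza://nz-host.example.com:5480/mydb"

# Generic Hadoop options (-D...) must come right after "import",
# before Sqoop-specific options such as --connect.
SQOOP_CMD="sqoop import \
-Dmapreduce.job.queuename=${QUEUE} \
--connect ${JDBC_URL} \
--username nz_user \
--password-file /user/nz_user/.nz_password \
--table MY_TABLE \
--hive-import \
--hive-table mydb.my_table \
-m 4"

# Dry run: print the assembled command instead of executing it.
echo "$SQOOP_CMD"
```

Note that this setting only covers the MapReduce import job that Sqoop itself submits; it does not necessarily carry over to anything the Hive step launches afterwards.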
Yes, I already pass -Dmapreduce.job.queuename=<queue_name>, but two applications run if you look at the YARN application list: the first one uses the queue given in that property and the second one uses the default queue. I have no idea why it launches two separate jobs.
I resolved this by configuring queue mappings and increasing the AM resource percent.
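For reference, a minimal sketch of what those two changes might look like in capacity-scheduler.xml, assuming the Capacity Scheduler is in use. The user name etl_user, the target queue A, and the value 0.3 are illustrative assumptions, not the settings from the cluster discussed above.

```xml
<!-- Hypothetical mapping: applications submitted by user etl_user go to queue A -->
<property>
  <name>yarn.scheduler.capacity.queue-mappings</name>
  <value>u:etl_user:A</value>
</property>

<!-- Hypothetical value: let ApplicationMasters use up to 30% of the default queue -->
<property>
  <name>yarn.scheduler.capacity.root.default.maximum-am-resource-percent</name>
  <value>0.3</value>
</property>
```

After editing the file, the scheduler configuration can usually be reloaded with yarn rmadmin -refreshQueues rather than restarting the ResourceManager.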