07-20-2017 02:26 PM
I have a very weird issue using Sqoop to pull data from Oracle into HDFS/Hive. I have three jobs running and the total record count is 312,163,901, but the import stops at 312,160,000. The jobs still show as running, but they hang: nothing appears in the log and there are no further updates.
A different table works fine.
It looks to me as if the last 3,901 records somehow got stuck. Can someone help or give some hints?
07-20-2017 07:34 PM
This is due to a Java heap space issue.
Please try the steps below:
1. Check the current values of mapreduce.map.memory.mb and mapreduce.reduce.memory.mb. There are several ways to check: Cloudera Manager -> YARN -> Configuration, the Hive/Beeline CLI, or mapred-site.xml / yarn-site.xml.
2. Temporarily increase the Java heap space by 1 or 2 GB. I've already shared the details in the link below. NOTE: The link refers to a different issue, but the same solution applies here because Sqoop runs on MapReduce.
3. Run the Sqoop job again. If the issue is fixed, work with your admin to increase the setting permanently.
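
For what it's worth, here is a rough sketch of how the temporary increase in step 2 could be passed for a single run using Hadoop's generic -D options, which Sqoop accepts right after the tool name. The memory sizes, the 80% heap ratio, and every connection detail (host, service name, user, table, target directory) are made-up placeholders and not values from this thread. Sqoop imports run as map-only jobs, so only the map-side settings are shown. The script only prints the command so it can be reviewed first.

```shell
# Assumed values, not from the thread: adjust to your cluster.
MAP_MB=4096                         # YARN container size per map task, in MB
HEAP_MB=$(( MAP_MB * 80 / 100 ))    # JVM heap at about 80% of the container (a common rule of thumb)

# To see the current value first, the Hive CLI can print it:
#   hive -e 'set mapreduce.map.memory.mb;'

# Generic -D options must come directly after the tool name ("import").
SQOOP_OPTS="-Dmapreduce.map.memory.mb=${MAP_MB} -Dmapreduce.map.java.opts=-Xmx${HEAP_MB}m"

# Hypothetical connection details; replace with your own.
SQOOP_CMD="sqoop import ${SQOOP_OPTS} --connect jdbc:oracle:thin:@//dbhost:1521/ORCL --username scott -P --table MY_TABLE --target-dir /user/me/my_table"

# Print the command for review instead of running it.
echo "${SQOOP_CMD}"
```

Because the -D values apply only to that one invocation, nothing changes cluster-wide; if the run completes, the same numbers can be handed to the admin for the permanent change in step 3.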