Created on 12-10-2015 12:55 PM - edited 09-16-2022 02:52 AM
We have a Sqoop 1 query that is throwing an "Error: Java heap space" message on both our Sqoop driver and the Map/Reduce jobs running under Yarn. We were able to increase the Sqoop driver heap by setting HADOOP_HEAPSIZE to 2000 (megabytes, so roughly 2 GB), and that has solved the initial issue. From the way the scripts work, it looks like you just pass in the number of megabytes and the script prefixes it with -Xmx and appends 'm'.
export HADOOP_HEAPSIZE=2000
sqoop import ......
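For reference, the conversion described above can be sketched as below. This is only an illustration of the idea, not the actual Hadoop launcher script; the variable name heap_flag is made up for the example, and it assumes the flag is -Xmx (the JVM maximum heap option).

```shell
# Sketch only: mimics how a launcher script could turn HADOOP_HEAPSIZE
# (a plain number of megabytes) into a JVM maximum-heap flag.
HADOOP_HEAPSIZE=2000
heap_flag="-Xmx${HADOOP_HEAPSIZE}m"   # heap_flag is a hypothetical name
echo "$heap_flag"
```

With HADOOP_HEAPSIZE=2000 this prints -Xmx2000m, which is why the value is given without a unit.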
However, we can't find the correct place to set what we presume is the container memory and actual task process heap size configuration. Our cluster is currently configured with the following settings for Yarn. These are set via Cloudera Manager and are stored in the mapred-site.xml file. We don't want to adjust the entire cluster setting as these work fine for 99% of the jobs we run. We just have one problem child that we'd like to tune.
mapreduce.map.memory.mb=1024
mapreduce.map.java.opts=-Djava.net.preferIPv4Stack=true -Xmx825955249
We have tried the following without any luck. Are there any other suggestions for where we should be configuring these two settings for Sqoop 1 initiated jobs?
export HADOOP_OPTS="-Dmapreduce.map.memory.mb=2000 -Dmapreduce.map.java.opts=-Xmx1500m"
export HADOOP_CLIENT_OPTS="-Dmapreduce.map.memory.mb=2000 -Dmapreduce.map.java.opts=-Xmx1500m"
export YARN_OPTS="-Dmapreduce.map.memory.mb=2000 -Dmapreduce.map.java.opts=-Xmx1500m"
export YARN_CLIENT_OPTS="-Dmapreduce.map.memory.mb=2000 -Dmapreduce.map.java.opts=-Xmx1500m"
sqoop import ....
Created 12-10-2015 01:21 PM
As so often happens, I went out for a walk and came back to look at a few other things. And sure enough, I now see how you tune the job: pass the two properties as generic -D options directly on the sqoop command line, right after the tool name. I guess if any good can come from my lack of attention to detail, at least I now have it engraved in my mind.
sqoop import -D mapreduce.map.memory.mb=4096 -D mapreduce.map.java.opts=-Xmx3000m ....
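For anyone tuning a similar job, here is a small sketch of how the two values could be kept in step. It assumes the heap should be about 75% of the container, which is my own rule of thumb rather than a documented recommendation, and the variable names are hypothetical. The command is only echoed, since the rest of the import arguments are elided in this thread and would need to be appended after the -D options.

```shell
# Container size for the map task, in MB (value from the command above).
container_mb=4096
# Assumed ratio: leave roughly 25% of the container for non-heap JVM overhead.
heap_mb=$(( container_mb * 75 / 100 ))
# Generic -D options go right after "import", before any tool-specific arguments.
sqoop_cmd="sqoop import -D mapreduce.map.memory.mb=${container_mb} -D mapreduce.map.java.opts=-Xmx${heap_mb}m"
echo "$sqoop_cmd"
```

Note that setting mapreduce.map.java.opts this way replaces the whole cluster value for this job, so the -Djava.net.preferIPv4Stack=true from the cluster setting is not carried over unless you add it back yourself.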