Support Questions

Find answers, ask questions, and share your expertise

Sqoop virtual memory error

Contributor

Hi.  I am having a "what the heck" moment, and could someone please explain the theory behind this?  I had always presumed that Sqoop, unlike other MR processes that might need the entire dataset in memory to work, should never hit an OOM issue.  After all, it only uses its memory as a buffer, copying the data from the DB to a staging area in HDFS and, when complete, moving it from staging to --target-dir.

 

So, we were moving a fairly large DB (500GB) but our client would only allow us to use 1 mapper (don't ask why...gulp).  About 90 minutes into the process, it terminated with:

 

Container is running beyond the 'PHYSICAL' memory limit. Current usage: 1.0 GB of 1 GB physical memory used; 2.7 GB of 2.1 GB virtual memory used. Killing container

 

This is really confusing me.  I suppose I can solve the problem by

a) increasing the vmem/pmem ratio (yarn.nodemanager.vmem-pmem-ratio = xyz)

OR

b) not checking for this error (yarn.nodemanager.vmem-check-enabled = false).
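Incidentally, the numbers in the kill message line up with option (a): assuming the default yarn.nodemanager.vmem-pmem-ratio of 2.1, the arithmetic works out like this (a quick sketch, not from the logs):

```python
# Sanity-check the figures from the YARN kill message, assuming the
# default yarn.nodemanager.vmem-pmem-ratio of 2.1.
pmem_limit_gb = 1.0                                # container physical memory limit
vmem_pmem_ratio = 2.1                              # default ratio
vmem_limit_gb = pmem_limit_gb * vmem_pmem_ratio    # virtual memory allowance: 2.1 GB
vmem_used_gb = 2.7                                 # virtual memory reported in the message

# 2.7 GB used > 2.1 GB allowed, so the vmem check kills the container.
print(vmem_used_gb > vmem_limit_gb)
```

So raising the ratio (or disabling the check) would indeed make the message go away, which is exactly why it doesn't answer the "why" question.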

 

But WHY is this error coming up?  

 

Thanks in advance and cheers.

1 ACCEPTED SOLUTION

Mentor
One possibility could be the fetch size (combined with some unexpectedly
wide rows). Does lowering the result fetch size help?

From http://sqoop.apache.org/docs/1.4.7/SqoopUserGuide.html#idp774390917888:

--fetch-size: Number of entries to read from database at once.
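For example, a lowered fetch size might look like this on the command line (a sketch only: the connection string, table name, and paths are placeholders, not details from your job):

```shell
# Hypothetical import with a reduced fetch size; connection string,
# credentials, table, and target directory are all placeholders.
sqoop import \
  --connect jdbc:mysql://db.example.com/sales \
  --username etl_user -P \
  --table orders \
  --target-dir /staging/orders \
  --num-mappers 1 \
  --fetch-size 100
```

With wide rows, each buffered entry is large, so the rows held in memory per fetch scale directly with this value.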

Also, do you always see it fail with the YARN memory kill (due to pmem
exhaustion), or do you also occasionally observe an actual
java.lang.OutOfMemoryError? If it is always the former, another suspect
would be off-heap memory use by the JDBC driver, although I've not come
across such a problem.
