Hi. I am having a "What the heck" moment. Could someone please explain the theory behind this? I have always presumed that Sqoop - unlike other MR processes that might require the entire dataset to be in memory to work - should never have an OOM issue. After all, it is using its memory as a buffer, copying the data from the DB to the staging area in HDFS, and when complete, moving it from staging to --target-dir.
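To make that presumption concrete, here is roughly the mental model I had in mind, as a small Python sketch. This is only my assumption about buffered copying in general, not Sqoop's actual code; `copy_in_batches`, `write_batch` and `batch_size` are names I made up for illustration.

```python
from itertools import islice


def copy_in_batches(rows, write_batch, batch_size=1000):
    """Stream rows to a sink, holding at most batch_size rows in memory.

    Hypothetical helper: models a fixed-size buffer between a source
    (the DB) and a sink (the staging area). Not Sqoop's implementation.
    """
    it = iter(rows)
    copied = 0
    while True:
        # Pull the next chunk from the source; never more than batch_size rows.
        batch = list(islice(it, batch_size))
        if not batch:
            break
        write_batch(batch)  # hand the chunk to the sink, then let it go
        copied += len(batch)
    return copied


if __name__ == "__main__":
    sizes = []
    total = copy_in_batches((i for i in range(2500)), lambda b: sizes.append(len(b)))
    print(total, sizes)
```

With a model like this, memory use depends on the batch size and not on the total amount of data, which is why I did not expect a 500GB import to matter.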
So, we were moving a fairly large DB (500GB) but our client would only allow us to use 1 mapper (don't ask why...gulp). About 90 minutes into the process, it terminated with:
Container is running beyond the 'PHYSICAL' memory limit. Current usage: 1.0Gib of 1 GB physical memory used; 2.7GB of 2.1 GB virtual memory used. Killing container
This is really confusing me. I suppose I can solve the problem by either:
a) increasing the vmem/pmem ratio (yarn.nodemanager.vmem-pmem-ratio = xyz), or
b) disabling the virtual memory check altogether (yarn.nodemanager.vmem-check-enabled = false).
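Just to check my own arithmetic on those two options, here is a rough Python model of the limits as I understand them. The function names are made up for illustration, the 2.1 ratio is assumed to be the default (it matches the 2.1 GB in the message), and treating "at the limit" as a violation is my simplification because the logged values are rounded. This is not how YARN actually implements the check.

```python
GB = 1024 ** 3


def container_limits(pmem_limit, vmem_pmem_ratio=2.1):
    """Return (physical_limit, virtual_limit) in bytes.

    Assumption: the virtual limit is the physical limit times
    yarn.nodemanager.vmem-pmem-ratio.
    """
    return pmem_limit, pmem_limit * vmem_pmem_ratio


def over_limits(pmem_used, vmem_used, pmem_limit,
                vmem_pmem_ratio=2.1, vmem_check_enabled=True):
    """Report which of the two checks would flag the container (rough model)."""
    phys_limit, virt_limit = container_limits(pmem_limit, vmem_pmem_ratio)
    return {
        # Simplification: usage at the limit counts as a violation here.
        "physical": pmem_used >= phys_limit,
        # The ratio and the vmem-check switch only influence this entry.
        "virtual": vmem_check_enabled and vmem_used > virt_limit,
    }


if __name__ == "__main__":
    used_p, used_v, limit = 1.0 * GB, 2.7 * GB, 1 * GB
    print(over_limits(used_p, used_v, limit))                            # as logged
    print(over_limits(used_p, used_v, limit, vmem_pmem_ratio=4.0))       # option a)
    print(over_limits(used_p, used_v, limit, vmem_check_enabled=False))  # option b)
```

If this model is right, both a) and b) only change the virtual-memory check, while the physical usage still sits at the 1 GB limit, which is part of why I am confused.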
But WHY is this error coming up?
Thanks in advance and cheers.