Sqoop import fails due to lack of memory: fine-tuning possibilities

Contributor
Hi Everyone,
I'm trying to import an 81 MB table from Oracle into HDFS (Hive table). The same issue occurs when importing from Oracle into the HDFS filesystem (without --hive-import). It's a two-node cluster used for PoC purposes; one node has 3 GB of memory and the other has 15 GB. Is it possible to tune the job so it runs only on the 15 GB node, or to run it on both nodes with some memory adjustments on node1?
sqoop import --connect jdbc:oracle:thin:@oracledbhost:1521:VAEDEV --table WC_LOY_MEM_TXN --username OLAP -P -m 1
Diagnostics: Container [pid=10840,containerID=container_e05_1463664059655_0005_02_000001] is running beyond physical memory limits. Current usage: 269.4 MB of 256 MB physical memory used; 2.1 GB of 537.6 MB virtual memory used. Killing container.
Dump of the process-tree for container_e05_1463664059655_0005_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
1 ACCEPTED SOLUTION

Guru

You can increase memory on your mappers. Take a look at mapreduce.map.memory.mb, mapreduce.reduce.memory.mb, mapreduce.map.java.opts and mapreduce.reduce.java.opts.

I think your mapreduce.map.memory.mb is set to 256 MB, based on the error. I don't know what else is running on your 3 GB node and what heap it is given, but you may be able to allocate 1 GB of it to YARN (container memory). It is also possible to get the job to run on the 15 GB node by using node labels. You can also switch off the NodeManager on the 3 GB node if other processes are running on it, so that the job uses the 15 GB node.
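To sketch the last two options (the label name, hostname, and exact steps below are placeholders and depend on your Hadoop/HDP version; on HDP these are normally done through Ambari):

Stop the NodeManager on the 3 GB node (run as the yarn user on node1), so containers can only be scheduled on the 15 GB node:

yarn-daemon.sh stop nodemanager

Or label the 15 GB node and submit the job to a queue that can use that label (node labels must be enabled on the ResourceManager, and the queue needs access to the label in the capacity scheduler configuration):

yarn rmadmin -addToClusterNodeLabels "bignode"
yarn rmadmin -replaceLabelsOnNode "node2-host.example.com=bignode"

The per-node "container memory" mentioned above is yarn.nodemanager.resource.memory-mb in yarn-site.xml.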


5 REPLIES


Contributor

The job runs fine when I turn off the NodeManager on node1. Thanks. I will look into how to run it based on node labels.

Super Collaborator

@elan chelian Can you try your sqoop command with the map memory increased from 256 MB to a higher value, like this:

sqoop import -D mapreduce.map.memory.mb=2048 -D mapreduce.map.java.opts=-Xmx1024m --connect jdbc:oracle:thin:@oracledbhost:1521:VAEDEV --table WC_LOY_MEM_TXN --username OLAP -P -m 1
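(The -D options are generic Hadoop arguments and must come before the Sqoop-specific arguments; the -Xmx heap is kept below mapreduce.map.memory.mb so the JVM fits inside the container with some headroom.)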

Contributor

Thanks, Pradeep. Now the error message differs; a resource crunch on node1 seems to be the cause.

I will try disabling the NodeManager on node1 and re-running the job.

16/05/25 13:24:20 INFO mapreduce.Job: Running job: job_1464166918626_0020
16/05/25 13:25:06 INFO mapreduce.Job: Job job_1464166918626_0020 running in uber mode : false
16/05/25 13:25:06 INFO mapreduce.Job: map 100% reduce 0%
16/05/25 13:25:12 INFO mapreduce.Job: Job job_1464166918626_0020 failed with state KILLED due to: MAP capability required is more than the supported max container capability in the cluster. Killing the Job. mapResourceRequest: <memory:2048, vCores:1> maxContainerCapability:<memory:1024, vCores:3> Job received Kill while in RUNNING state.
16/05/25 13:25:12 INFO mapreduce.Job: Counters: 6
  Job Counters
    Killed map tasks=1
    Total time spent by all maps in occupied slots (ms)=0
    Total time spent by all reduces in occupied slots (ms)=0
    Total time spent by all map tasks (ms)=0
    Total vcore-seconds taken by all map tasks=0
    Total megabyte-seconds taken by all map tasks=0
16/05/25 13:25:12 WARN mapreduce.Counters: Group FileSystemCounters is deprecated. Use org.apache.hadoop.mapreduce.FileSystemCounter instead
16/05/25 13:25:12 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 68.7027 seconds (0 bytes/sec)
16/05/25 13:25:12 WARN mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
16/05/25 13:25:12 INFO mapreduce.ImportJobBase: Retrieved 0 records.
16/05/25 13:25:12 ERROR tool.ImportTool: Error during import: Import job failed!

Super Collaborator

@elan chelian This line tells us that the job is going to a machine that has only 1 GB of resources:

MAP capability required is more than the supported max container capability in the cluster. Killing the Job. mapResourceRequest: <memory:2048, vCores:1> maxContainerCapability:<memory:1024, vCores:3> Job received Kill while in R

You can reduce the requested container size to 1 GB and run the Sqoop import.
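(The 1 GB ceiling most likely comes from yarn.scheduler.maximum-allocation-mb, together with yarn.nodemanager.resource.memory-mb on the NodeManagers; raising those on the 15 GB node would be the alternative to shrinking the map request.)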

Try running this modified command:

sqoop import -D mapreduce.map.memory.mb=1024 -D mapreduce.map.java.opts=-Xmx768m --connect jdbc:oracle:thin:@oracledbhost:1521:VAEDEV --table WC_LOY_MEM_TXN --username OLAP -P -m 1