
Memory limits issue while loading partitioned tables


New Contributor

Hi

 

I am loading 2 partitioned tables into Hive/Impala. The way I am doing this is by first creating and loading non-partitioned tables, and then running INSERT INTO TABLE xxx PARTITION (column) SELECT col1, col2, ... FROM non_partitioned_table;
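To make it concrete, the pattern looks roughly like this (table and column names below are placeholders, not my real schema):

```sql
-- Placeholder names; the real tables follow the same pattern.
-- Allow dynamic partitioning for the INSERT below.
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;

-- Step 1: plain (non-partitioned) staging table, loaded from raw files.
CREATE TABLE staging_orders (col1 STRING, col2 STRING, part_col STRING);

-- Step 2: partitioned target table, filled from the staging table.
-- The partition column must come last in the SELECT list.
CREATE TABLE orders (col1 STRING, col2 STRING)
PARTITIONED BY (part_col STRING);

INSERT INTO TABLE orders PARTITION (part_col)
SELECT col1, col2, part_col FROM staging_orders;
```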

 

But I see the following error. Can someone please help me figure out what is going on?

 

Thanks.
Farah

 

org.apache.hadoop.hive.shims.HadoopShimsSecure: Can't fetch tasklog: TaskLogServlet is not supported in MR2 mode.
2015-06-02 16:34:24,354 ERROR org.apache.hadoop.hive.ql.exec.Task:
Task with the most failures(4):
-----
Task ID:
  task_1433186868794_0018_r_000013

URL:
  http://ash-r101-15l.mstrprime.com:8088/taskdetails.jsp?jobid=job_1433186868794_0018&tipid=task_14331...
-----
Diagnostic Messages for this Task:
Container [pid=8493,containerID=container_1433186868794_0018_01_002138] is running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used; 1.7 GB of 2.1 GB virtual memory used. Killing container.
Dump of the process-tree for container_1433186868794_0018_01_002138 :
    |- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
    |- 8493 8485 8493 8493 (bash) 0 0 108654592 305 /bin/bash -c /usr/java/jdk1.7.0_67-cloudera/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN  -Djava.net.preferIPv4Stack=true -Xmx825955249 -Djava.io.tmpdir=/MSTR/yarn/nm/usercache/hive/appcache/application_1433186868794_0018/container_1433186868794_0018_01_002138/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1433186868794_0018/container_1433186868794_0018_01_002138 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 10.242.101.27 57709 attempt_1433186868794_0018_r_000013_3 2138 1>/var/log/hadoop-yarn/container/application_1433186868794_0018/container_1433186868794_0018_01_002138/stdout 2>/var/log/hadoop-yarn/container/application_1433186868794_0018/container_1433186868794_0018_01_002138/stderr  
    |- 8508 8493 8493 8493 (java) 5638 7494 1706004480 262837 /usr/java/jdk1.7.0_67-cloudera/bin/java -Djava.net.preferIPv4Stack=true -Dhadoop.metrics.log.level=WARN -Djava.net.preferIPv4Stack=true -Xmx825955249 -Djava.io.tmpdir=/MSTR/yarn/nm/usercache/hive/appcache/application_1433186868794_0018/container_1433186868794_0018_01_002138/tmp -Dlog4j.configuration=container-log4j.properties -Dyarn.app.container.log.dir=/var/log/hadoop-yarn/container/application_1433186868794_0018/container_1433186868794_0018_01_002138 -Dyarn.app.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA org.apache.hadoop.mapred.YarnChild 10.242.101.27 57709 attempt_1433186868794_0018_r_000013_3 2138

Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143


2015-06-02 16:34:24,378 INFO org.apache.hadoop.yarn.client.api.impl.YarnClientImpl: Killed application application_1433186868794_0018
2015-06-02 16:34:27,647 ERROR org.apache.hadoop.hive.ql.Driver: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
2015-06-02 16:34:27,648 INFO org.apache.hadoop.hive.ql.log.PerfLogger: </PERFLOG method=Driver.execute start=1433259105794 end=1433262867648 duration=3761854 from=org.apache.hadoop.hive.ql.Driver>
2015-06-02 16:34:27,648 INFO org.apache.hadoop.hive.ql.Driver: MapReduce Jobs Launched:
2015-06-02 16:34:27,648 INFO org.apache.hadoop.hive.ql.Driver: Stage-Stage-1: Map: 480  Reduce: 120   Cumulative CPU: 42936.02 sec   HDFS Read: 128752801581 HDFS Write: 0 FAIL
2015-06-02 16:34:27,648 INFO org.apache.hadoop.hive.ql.Driver: Total MapReduce CPU Time Spent: 0 days 11 hours 55 minutes 36 seconds 20 msec
2015-06-02 16:34:27,648 INFO org.apache.hadoop.hive.ql.log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2015-06-02 16:34:27,648 INFO ZooKeeperHiveLockManager:  about to release lock for tuto1tbtext/t_order_detail
2015-06-02 16:34:27,932 INFO ZooKeeperHiveLockManager:  about to release lock for tuto1tbtext/intermediate_t_order_detail
2015-06-02 16:34:28,165 INFO ZooKeeperHiveLockManager:  about to release lock for tuto1tbtext
2015-06-02 16:34:28,423 INFO org.apache.hadoop.hive.ql.log.PerfLogger: </PERFLOG method=releaseLocks start=1433262867648 end=1433262868423 duration=775 from=org.apache.hadoop.hive.ql.Driver>
2015-06-02 16:34:28,424 ERROR org.apache.hive.service.cli.operation.Operation: Error running hive query:
org.apache.hive.service.cli.HiveSQLException: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
    at org.apache.hive.service.cli.operation.SQLOperation.runInternal(SQLOperation.java:147)
    at org.apache.hive.service.cli.operation.SQLOperation.access$000(SQLOperation.java:69)
    at org.apache.hive.service.cli.operation.SQLOperation$1$1.run(SQLOperation.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
    at org.apache.hadoop.hive.shims.HadoopShimsSecure.doAs(HadoopShimsSecure.java:502)
    at org.apache.hive.service.cli.operation.SQLOperation$1.run(SQLOperation.java:213)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
2015-06-02 16:34:28,444 INFO org.apache.hadoop.hive.ql.log.PerfLogger: <PERFLOG method=releaseLocks from=org.apache.hadoop.hive.ql.Driver>
2015-06-02 16:34:28,445 INFO org.apache.hadoop.hive.ql.log.PerfLogger: </PERFLOG method=releaseLocks start=1433262868444 end=1433262868445 duration=1 from=org.apache.hadoop.hive.ql.Driver>
2015-06-02 16:34:28,598 INFO org.apache.zookeeper.ZooKeeper: Session: 0x14db0970be51172 closed
2015-06-02 16:34:28,598 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
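From what I can tell, the key line is "running beyond physical memory limits. Current usage: 1.0 GB of 1 GB physical memory used", i.e. the reducer container was capped at 1 GB. If it helps, these are the session settings I was considering bumping (property names as of MR2; the values are illustrative, not a recommendation, so please correct me if these are not the right knobs):

```sql
-- Per-session in Hive; adjust the values to your cluster.
SET mapreduce.reduce.memory.mb = 4096;       -- YARN container size for reducers
SET mapreduce.reduce.java.opts = -Xmx3276m;  -- JVM heap, ~80% of the container
```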


Re: Memory limits issue while loading partitioned tables

Contributor

Hi, Farah,

 

Are you still having this problem?

Could you share the actual INSERT statements you're using to load the partitioned tables?

 

André 

Re: Memory limits issue while loading partitioned tables

Champion Alumni

Hello,

 

I'm having the same error.

 

My query (anonymised) is below:

WITH alias_table_name AS (
  SELECT
    myUDFName(oneField) AS oneField_map,
    a, b, c, d, e
  FROM table_source_name
  GROUP BY a, b, c, d, e, oneField
)
INSERT INTO TABLE table_target_name
SELECT DISTINCT
  concat('myprefix|', a, '|', substring(d, 0, 10), '|', c, '|',
         nvl(oneField_map["key_a"], ""), '|',
         nvl(oneField_map["key_b"], ""), '|',
         nvl(oneField_map["key_c"], ""), '|',
         nvl(oneField_map["key_d"], ""), '|',
         nvl(b, "")) AS key,
  a,
  c,
  b,
  nvl(oneField_map["key_a"], null) AS key_a,
  nvl(oneField_map["key_b"], null) AS key_b,
  nvl(oneField_map["key_e"], null) AS key_e,
  nvl(oneField_map["key_c"], null) AS key_c,
  nvl(oneField_map["key_d"], null) AS key_d,
  "myprefix" AS fieldName
FROM alias_table_name
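In case it is relevant: since the GROUP BY and DISTINCT stages put the memory pressure on the reducers, I am also experimenting with spreading the work across more, smaller reducers (the value below is illustrative, roughly 256 MB of input per reducer):

```sql
-- Lowering this makes Hive launch more reducers, each handling less data.
SET hive.exec.reducers.bytes.per.reducer = 268435456;
```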
GHERMAN Alina