Created 06-18-2021 06:09 AM
Hello Everyone,
In my CDP setup, I'm not able to run ANALYZE TABLE commands on external tables.
The YARN application doesn't start immediately. After 5-10 minutes it does start, but it is killed within 10 seconds with the following error stack trace.
ERROR : Status: Failed
ERROR : Application application_1623850591633_0042 failed 2 times due to AM Container for appattempt_1623850591633_0042_000002 exited with exitCode: -104
Failing this attempt.Diagnostics: [2021-06-18 18:32:52.722]Container [pid=32822,containerID=container_e49_1623850591633_0042_02_000001] is running 34230272B beyond the 'PHYSICAL' memory limit. Current usage: 2.0 GB of 2 GB physical memory used; 3.9 GB of 4.2 GB virtual memory used. Killing container.
Dump of the process-tree for container_e49_1623850591633_0042_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 32829 32822 32822 32822 (java) 3070 3010 4117856256 532273 /usr/java/default/bin/java -Xmx1638m -Djava.io.tmpdir=/data8/yarn/nm/usercache/hive/appcache/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001/tmp -server -Djava.net.preferIPv4Stack=true -XX:+PrintGCDetails -verbose:gc -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/data1/yarn/container-logs/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001 -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel= org.apache.tez.dag.app.DAGAppMaster --session
|- 32822 32819 32822 32822 (bash) 0 0 118185984 372 /bin/bash -c /usr/java/default/bin/java -Xmx1638m -Djava.io.tmpdir=/data8/yarn/nm/usercache/hive/appcache/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001/tmp -server -Djava.net.preferIPv4Stack=true -XX:+PrintGCDetails -verbose:gc -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/data1/yarn/container-logs/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001 -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster --session 1>/data1/yarn/container-logs/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001/stdout 2>/data1/yarn/container-logs/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001/stderr
[2021-06-18 18:32:52.731]Container killed on request. Exit code is 143
[2021-06-18 18:32:52.731]Container exited with a non-zero exit code 143.
For more detailed output, check the application tracking page: http://<yarn-hostname>:8088/cluster/app/application_1623850591633_0042 Then click on links to logs of each attempt.
. Failing the application.
ERROR : FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Application application_1623850591633_0042 failed 2 times due to AM Container for appattempt_1623850591633_0042_000002 exited with exitCode: -104
Failing this attempt.Diagnostics: [2021-06-18 18:32:52.722]Container [pid=32822,containerID=container_e49_1623850591633_0042_02_000001] is running 34230272B beyond the 'PHYSICAL' memory limit. Current usage: 2.0 GB of 2 GB physical memory used; 3.9 GB of 4.2 GB virtual memory used. Killing container.
Dump of the process-tree for container_e49_1623850591633_0042_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 32829 32822 32822 32822 (java) 3070 3010 4117856256 532273 /usr/java/default/bin/java -Xmx1638m -Djava.io.tmpdir=/data8/yarn/nm/usercache/hive/appcache/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001/tmp -server -Djava.net.preferIPv4Stack=true -XX:+PrintGCDetails -verbose:gc -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/data1/yarn/container-logs/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001 -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel= org.apache.tez.dag.app.DAGAppMaster --session
|- 32822 32819 32822 32822 (bash) 0 0 118185984 372 /bin/bash -c /usr/java/default/bin/java -Xmx1638m -Djava.io.tmpdir=/data8/yarn/nm/usercache/hive/appcache/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001/tmp -server -Djava.net.preferIPv4Stack=true -XX:+PrintGCDetails -verbose:gc -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/data1/yarn/container-logs/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001 -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster --session 1>/data1/yarn/container-logs/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001/stdout 2>/data1/yarn/container-logs/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001/stderr
[2021-06-18 18:32:52.731]Container killed on request. Exit code is 143
[2021-06-18 18:32:52.731]Container exited with a non-zero exit code 143.
For more detailed output, check the application tracking page: http://<yarn-hostname>:8088/cluster/app/application_1623850591633_0042 Then click on links to logs of each attempt.
. Failing the application.
INFO : Completed executing command(queryId=hive_20210618183003_95b31532-fda8-4d10-bb14-bfcb1e833ca7); Time taken: 17.503 seconds
INFO : OK
Error: Error while compiling statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Application application_1623850591633_0042 failed 2 times due to AM Container for appattempt_1623850591633_0042_000002 exited with exitCode: -104
Failing this attempt.Diagnostics: [2021-06-18 18:32:52.722]Container [pid=32822,containerID=container_e49_1623850591633_0042_02_000001] is running 34230272B beyond the 'PHYSICAL' memory limit. Current usage: 2.0 GB of 2 GB physical memory used; 3.9 GB of 4.2 GB virtual memory used. Killing container.
Dump of the process-tree for container_e49_1623850591633_0042_02_000001 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 32829 32822 32822 32822 (java) 3070 3010 4117856256 532273 /usr/java/default/bin/java -Xmx1638m -Djava.io.tmpdir=/data8/yarn/nm/usercache/hive/appcache/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001/tmp -server -Djava.net.preferIPv4Stack=true -XX:+PrintGCDetails -verbose:gc -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/data1/yarn/container-logs/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001 -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel= org.apache.tez.dag.app.DAGAppMaster --session
|- 32822 32819 32822 32822 (bash) 0 0 118185984 372 /bin/bash -c /usr/java/default/bin/java -Xmx1638m -Djava.io.tmpdir=/data8/yarn/nm/usercache/hive/appcache/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001/tmp -server -Djava.net.preferIPv4Stack=true -XX:+PrintGCDetails -verbose:gc -XX:+UseNUMA -XX:+UseG1GC -XX:+ResizeTLAB -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp -Dlog4j.configuratorClass=org.apache.tez.common.TezLog4jConfigurator -Dlog4j.configuration=tez-container-log4j.properties -Dyarn.app.container.log.dir=/data1/yarn/container-logs/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001 -Dtez.root.logger=INFO,CLA -Dsun.nio.ch.bugLevel='' org.apache.tez.dag.app.DAGAppMaster --session 1>/data1/yarn/container-logs/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001/stdout 2>/data1/yarn/container-logs/application_1623850591633_0042/container_e49_1623850591633_0042_02_000001/stderr
[2021-06-18 18:32:52.731]Container killed on request. Exit code is 143
[2021-06-18 18:32:52.731]Container exited with a non-zero exit code 143.
For more detailed output, check the application tracking page: http://<yarn-hostname>:8088/cluster/app/application_1623850591633_0042 Then click on links to logs of each attempt.
. Failing the application. (state=08S01,code=2)
This happens irrespective of the size of the table.
Any ideas about this?
Thanks,
Megh
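The numbers in the diagnostics above can be cross-checked with a little arithmetic. The sketch below sums the RSSMEM_USAGE(PAGES) column for the two processes in the process-tree dump and compares the total with the 2 GB limit. The 4096-byte page size is an assumption (the log does not state it), and the function name is only illustrative. Note also that the killed container runs org.apache.tez.dag.app.DAGAppMaster, i.e. the Tez AM rather than a task container, with a 1638 MB heap inside a 2048 MB container, which would be consistent with the failure not depending on table size.

```python
# Assumption: 4 KiB memory pages (the usual x86-64 Linux default); not stated in the log.
PAGE_SIZE = 4096

# RSSMEM_USAGE(PAGES) values copied from the process-tree dump in the post.
rss_pages = {
    "java (DAGAppMaster)": 532273,
    "bash (launcher)": 372,
}

# "2 GB physical memory" limit reported in the diagnostics, taken as 2 GiB.
limit_bytes = 2 * 1024**3


def overrun_bytes(pages_by_process, limit, page_size=PAGE_SIZE):
    """Return how many bytes the process tree's resident memory exceeds the limit by."""
    used = sum(pages_by_process.values()) * page_size
    return used - limit


if __name__ == "__main__":
    # Compare with the "is running 34230272B beyond the 'PHYSICAL' memory limit" line.
    print(overrun_bytes(rss_pages, limit_bytes))
```

Under that page-size assumption the result equals the 34230272B figure in the log, so the overrun is resident memory of the AM's JVM on top of its heap, not virtual memory.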
Created 06-20-2021 06:16 AM
Hello @vidanimegh
Error: Error while compiling statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Application application_1623850591633_0042 failed 2 times due to AM Container for appattempt_1623850591633_0042_000002 exited with exitCode: -104 Failing this attempt.Diagnostics: [2021-06-18 18:32:52.722]Container [pid=32822,containerID=container_e49_1623850591633_0042_02_000001] is running 34230272B beyond the 'PHYSICAL' memory limit. Current usage: 2.0 GB of 2 GB physical memory used; 3.9 GB of 4.2 GB virtual memory used. Killing container.
As I can see, your jobs are failing with a PHYSICAL memory limit error. Could you set the properties below at the beeline session level, re-run the ANALYZE query, and see how it goes?
set hive.tez.container.size=8192;
set hive.tez.java.opts=-Xmx6553m;
set tez.runtime.io.sort.mb=3072;
set tez.task.resource.memory.mb=8192;
set tez.am.resource.memory.mb=8192;
set tez.am.launch.cmd-opts=-Xmx6553m;
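The heap sizes in these settings appear to follow a simple ratio: 6553 MB of heap for an 8192 MB container, just as the failing AM had 1638 MB of heap for 2048 MB. The sketch below derives the `set` statements from a container size under that assumed 80% ratio. The ratio is inferred from the numbers in this thread, and the helper names are hypothetical, not part of Hive or Tez. The `m` suffix on `-Xmx` matters, since the JVM reads a bare number as bytes.

```python
# Assumption: heap is 80% of the container size, inferred from 6553/8192 and 1638/2048.
HEAP_FRACTION_TENTHS = 8


def heap_mb(container_mb):
    """Heap size in MB for a given container size, using integer math to avoid float rounding."""
    return container_mb * HEAP_FRACTION_TENTHS // 10


def session_settings(container_mb):
    """Build beeline `set` statements for the task and AM containers (illustrative helper)."""
    xmx = f"-Xmx{heap_mb(container_mb)}m"  # keep the 'm' unit suffix
    return [
        f"set hive.tez.container.size={container_mb};",
        f"set hive.tez.java.opts={xmx};",
        f"set tez.task.resource.memory.mb={container_mb};",
        f"set tez.am.resource.memory.mb={container_mb};",
        f"set tez.am.launch.cmd-opts={xmx};",
    ]


if __name__ == "__main__":
    print("\n".join(session_settings(8192)))
```

Leaving roughly 20% of the container for non-heap JVM memory (thread stacks, metaspace, GC structures, direct buffers) is a common rule of thumb; whether 8192 MB is the right size for a given cluster is a separate tuning question.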
Created 06-28-2021 03:46 AM
Hi @Shifu ,
I tried the configuration parameters you suggested, but I'm still facing the same error.
Thanks,
Megh
Created 06-29-2021 06:21 AM
Hi @Shifu ,
So supplying these config properties at runtime didn't work, but changing the parameters below in the service configuration did the job for me:
set tez.runtime.io.sort.mb=3072;
set tez.task.resource.memory.mb=8192;
set tez.am.resource.memory.mb=8192;
set tez.am.launch.cmd-opts=-Xmx6553m;
Not sure why that might be the case, but the issue seems to have been fixed.
Thanks,
Megh
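For anyone applying the same values at the service level, here is a sketch that renders them in the standard Hadoop `<configuration>`/`<property>` XML layout used by files such as tez-site.xml. Where the snippet is entered (for example a Cloudera Manager safety valve for tez-site.xml) is an assumption and depends on the deployment; the thread does not say how the service configuration was changed. The function name is illustrative.

```python
import xml.etree.ElementTree as ET

# Values copied from the post above; the target file and how it is deployed are assumptions.
TEZ_OVERRIDES = {
    "tez.runtime.io.sort.mb": "3072",
    "tez.task.resource.memory.mb": "8192",
    "tez.am.resource.memory.mb": "8192",
    "tez.am.launch.cmd-opts": "-Xmx6553m",
}


def to_hadoop_xml(props):
    """Render a dict of properties as Hadoop-style configuration XML."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    ET.indent(root)  # pretty-print; requires Python 3.9+
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(to_hadoop_xml(TEZ_OVERRIDES))
```

After a service-level change like this, the affected services typically need a restart and a client configuration redeploy before new queries pick up the values.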