
LLAP Start Error after increase memory parameters



Contributor

llap-start-error.txt

I increased the following parameters:

HiveServer Interactive Heap Size   10240   to   13251 (MB)
Memory per Daemon                  10240   to   13251 (MB)
LLAP Daemon Heap Size              8162    to   9794  (MB)
In-Memory Cache per Daemon         2048    to   2457  (MB) 
------
Not changed:
Memory allocated for all YARN containers on a node 164 GB
LLAP Daemon Container Max Headroom 1024  (MB)


stderr: 
2017-12-04 16:19:02,069 - LLAP app 'llap0' deployment unsuccessful.
Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/hive_server_interactive.py", line 680, in <module>
    HiveServerInteractive().execute()
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 314, in execute
    method(env)
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 762, in restart
    self.start(env, upgrade_type=upgrade_type)
  File "/var/lib/ambari-agent/cache/common-services/HIVE/0.12.0.2.0/package/scripts/hive_server_interactive.py", line 123, in start
    raise Fail("Skipping START of Hive Server Interactive since LLAP app couldn't be STARTED.")
resource_management.core.exceptions.Fail: Skipping START of Hive Server Interactive since LLAP app couldn't be STARTED.
11 REPLIES

Re: LLAP Start Error after increase memory parameters

Contributor

Please help me investigate this.


Re: LLAP Start Error after increase memory parameters

Contributor

LLAP app has failed to start; application_1512395880314_0027 app logs might have a real error, if any. It's also possible that the containers failed to start because either there isn't enough memory physically on the cluster, or there isn't enough space configured in the YARN queue being used.


Re: LLAP Start Error after increase memory parameters

Cloudera Employee

@dmitro:

It would be better if you could post the application logs or the container-level logs; they will contain the exact error. To me it looks like a memory issue, possibly related to the YARN container size.

You can get the application and container-level logs this way:

yarn logs -applicationId application_1512395880314_0027

yarn logs -containerId container_e115_1512395880314_0027_01_000006

For a full trace for app and container:

yarn logs -applicationId application_1512395880314_0027 -containerId container_e115_1512395880314_0027_01_000014
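
When only a container ID is at hand, the application ID needed for the combined command can usually be derived from it. The following is a minimal sketch, assuming the container ID layout seen in this thread (`container_e<epoch>_<clusterTimestamp>_<appSequence>_<attempt>_<container>`, with the `e<epoch>_` part optional). The helper names `application_id_for` and `yarn_logs_command` are hypothetical, not part of any YARN tooling.

```python
import re

# Assumed layout: container_[e<epoch>_]<clusterTimestamp>_<appSeq>_<attempt>_<container>
_CONTAINER_RE = re.compile(r"^container_(?:e\d+_)?(\d+)_(\d+)_\d+_\d+$")


def application_id_for(container_id):
    """Derive application_<clusterTimestamp>_<appSeq> from a container ID."""
    match = _CONTAINER_RE.match(container_id)
    if match is None:
        raise ValueError("unrecognized container ID: %s" % container_id)
    return "application_%s_%s" % (match.group(1), match.group(2))


def yarn_logs_command(container_id):
    """Build the argument list for the combined 'yarn logs' call shown above."""
    return [
        "yarn", "logs",
        "-applicationId", application_id_for(container_id),
        "-containerId", container_id,
    ]


if __name__ == "__main__":
    # Prints the command line for one of the containers from this thread.
    print(" ".join(yarn_logs_command("container_e115_1512395880314_0027_01_000006")))
```

The resulting list can be passed to `subprocess.run` on a node where the `yarn` client is installed.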


Re: LLAP Start Error after increase memory parameters

Contributor
Thank you very much!

Now I see this error:

"exec /usr/jdk64/jdk1.8.0_77/bin/java -Dproc_llapdaemon -Xms9794m -Xmx9794m -Dhttp.maxConnections=38 -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:TLABSize=8m -XX:+ResizeTLAB -XX:+UseNUMA 
-XX:+AggressiveOpts -XX:MetaspaceSize=1024m -XX:InitiatingHeapOccupancyPercent=80 -XX:MaxGCPauseMillis=600 -Xmx8192m -XX:MetaspaceSize=1024m -server -Djava.net.preferIPv4Stack=true 
-XX:+UseNUMA -XX:+PrintGCDetails -verbose:gc -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=4 -XX:GCLogFileSize=100M -XX:+PrintGCDateStamps 
-Xloggc:/grid/1/hadoop/yarn/log/application_1512395880314_0027/container_e115_1512395880314_0027_01_000006//gc_2017-12-04-16.log  *** "


Error occurred during initialization of VM
Initial heap size set to a larger value than the maximum heap size   <<<<<<<<<<<<<<<<<
---------------------------------
I think I need to change the "LLAP app java opts" setting, for example increasing -Xmx8192m to -Xmx16384m. Is that the maximum value?

"LLAP app java opts" :
-XX:+AlwaysPreTouch {% if java_version > 7 %}-XX:+UseG1GC -XX:TLABSize=8m -XX:+ResizeTLAB -XX:+UseNUMA -XX:+AggressiveOpts -XX:MetaspaceSize=1024m 
-XX:InitiatingHeapOccupancyPercent=80 -XX:MaxGCPauseMillis=200{% else %}-XX:+PrintGCDetails -verbose:gc 
-XX:+PrintGCTimeStamps -XX:+UseNUMA -XX:+UseParallelGC{% endif %} -Xmx8192m 


Re: LLAP Start Error after increase memory parameters

Contributor

Note that the command line has two Xmx values. Ambari has a config value for the LLAP Xmx that it adds to the command line; that is the recommended way to set Xmx. The custom value(s) added to the args seem to conflict with what Ambari adds.
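
A quick way to spot this kind of conflict is to scan the launch command from the container log for repeated heap flags. The sketch below is only an illustration: it assumes the JVM honors the last occurrence of a repeated flag (which is consistent with the "Initial heap size set to a larger value than the maximum heap size" error in this thread, where -Xms9794m meets a trailing -Xmx8192m), and it only understands sizes with an `m` or `g` suffix. The function names are hypothetical.

```python
import re

# Multipliers to convert a flag value to MB; only 'm' and 'g' suffixes are handled.
_TO_MB = {"m": 1, "g": 1024}


def heap_flags_mb(command_line):
    """Collect every -Xms/-Xmx value (in MB), in command-line order."""
    found = {"Xms": [], "Xmx": []}
    for name, value, unit in re.findall(r"-(Xms|Xmx)(\d+)([mMgG])\b", command_line):
        found[name].append(int(value) * _TO_MB[unit.lower()])
    return found


def check_heap(command_line):
    """Return a list of human-readable problems found in the heap flags."""
    flags = heap_flags_mb(command_line)
    problems = []
    for name in ("Xms", "Xmx"):
        if len(flags[name]) > 1:
            problems.append("-%s given %d times: %s" % (name, len(flags[name]), flags[name]))
    # Assumption: the last occurrence of each flag is the effective one.
    if flags["Xms"] and flags["Xmx"] and flags["Xms"][-1] > flags["Xmx"][-1]:
        problems.append(
            "effective -Xms (%d MB) is larger than effective -Xmx (%d MB)"
            % (flags["Xms"][-1], flags["Xmx"][-1])
        )
    return problems


if __name__ == "__main__":
    # Shortened version of the failing command line posted above.
    cmd = "java -Dproc_llapdaemon -Xms9794m -Xmx9794m -XX:+UseG1GC -Xmx8192m -server"
    for problem in check_heap(cmd):
        print(problem)
```

If the check reports a duplicate, removing the hand-written -Xmx from "LLAP app java opts" leaves only the value Ambari generates.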

Re: LLAP Start Error after increase memory parameters

Contributor

The problem was resolved after I changed "LLAP app java opts", and SELECT queries work now, but I see an error in the application log:

Application Container Diagnostics

Container ID: container_e119_1512480218177_0094_01_000002
Component: LLAP
State: 4
Exit Code: -104
Diagnostics:

Container [pid=15416,containerID=container_e119_1512480218177_0094_01_000002] is running beyond physical memory limits. Current usage: 27.0 GB of 26 GB physical memory used; 36.1 GB of 54.6 GB virtual memory used. Killing container.
Dump of the process-tree for container_e119_1512480218177_0094_01_000002 :
|- PID PPID PGRPID SESSID CMD_NAME USER_MODE_TIME(MILLIS) SYSTEM_TIME(MILLIS) VMEM_USAGE(BYTES) RSSMEM_USAGE(PAGES) FULL_CMD_LINE
|- 15919 1 15416 15416 (java) 148737 3440 38182346752 7082375 /usr/jdk64/jdk1.8.0_77/bin/java -Dproc_llapdaemon -Xms22251m -Xmx22251m -Dhttp.maxConnections=38 -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:TLABSize=8m -XX:+ResizeTLAB -XX:+UseNUMA -XX:+AggressiveOpts -XX:MetaspaceSize=1024m -XX:InitiatingHeapOccupancyPercent=80 -XX:MaxGCPauseMillis=200 -Xmx33251m -XX:MetaspaceSize=1024m -server -Djava.net.preferIPv4Stack=true -XX:+UseNUMA -XX:+PrintGCDetails -verbose:gc -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=4 -XX:GCLogFileSize=100M -XX:+PrintGCDateStamps -Xloggc:/grid/11/hadoop/yarn/log/application_1512480218177_0094/container_e119_1512480218177_0094_01_000002//gc_2017-12-05-16.log -Djava.io.tmpdir=/grid/6/hadoop/yarn/local/usercache/hive/appcache/application_1512480218177_0094/container_e119_1512480218177_0094_01_000002/tmp/ -Dlog4j.configurationFile=llap-daemon-log4j2.properties -Dllap.daemon.log.dir=/grid/11/hadoop/yarn/log/application_1512480218177_0094/container_e119_1512480218177_0094_01_000002/ -Dllap.daemon.log.file=llap-daemon-hive-ks-dmp12.kyivstar.ua.log -Dllap.daemon.root.logger=query-routing -Dllap.daemon.log.level=DEBUG -classpath /grid/6/hadoop/yarn/local/usercache/hive/appcache/application_1512480218177_0094/container_e119_1512480218177_0094_01_000002/app/install//conf/:/grid/6/hadoop/yarn/local/usercache/hive/appcache/application_1512480218177_0094/container_e119_1512480218177_0094_01_000002/app/install//lib/*:/grid/6/hadoop/yarn/local/usercache/hive/appcache/application_1512480218177_0094/container_e119_1512480218177_0094_01_000002/app/install//lib/tez/*:/grid/6/hadoop/yarn/local/usercache/hive/appcache/application_1512480218177_0094/container_e119_1512480218177_0094_01_000002/app/install//lib/udfs/*:.: org.apache.hadoop.hive.llap.daemon.impl.LlapDaemon
|- 15416 15414 15416 15416 (bash) 0 0 118026240 370 /bin/bash -c python ./infra/agent/slider-agent/agent/main.py --label container_e119_1512480218177_0094_01_000002___LLAP --zk-quorum ks-dmp03.kyivstar.ua:2181,ks-dmp01.kyivstar.ua:2181,ks-dmp02.kyivstar.ua:2181 --zk-reg-path /registry/users/hive/services/org-apache-slider/llap0 > /grid/11/hadoop/yarn/log/application_1512480218177_0094/container_e119_1512480218177_0094_01_000002/slider-agent.out 2>&1
|- 15427 15416 15416 15416 (python) 272 49 459268096 4579 python ./infra/agent/slider-agent/agent/main.py --label container_e119_1512480218177_0094_01_000002___LLAP --zk-quorum ks-dmp03.kyivstar.ua:2181,ks-dmp01.kyivstar.ua:2181,ks-dmp02.kyivstar.ua:2181 --zk-reg-path /registry/users/hive/services/org-apache-slider/llap0
Container killed on request. Exit code is 143
Container exited with a non-zero exit code 143.


Re: LLAP Start Error after increase memory parameters

Cloudera Employee

@Dmitro Vasilenko

The error log above reports a memory issue: [pid=15416,containerID=container_e119_1512480218177_0094_01_000002] is running beyond physical memory limits. Current usage: 27.0 GB of 26 GB physical memory used;

I think the memory settings for the LLAP daemon exceed the physical memory available. Please check.


Re: LLAP Start Error after increase memory parameters

Contributor

Hi! How can I resolve this LLAP error?

[pid=15416,containerID=container_e119_1512480218177_0094_01_000002] is running beyond physical memory limits. Current usage: 27.0 GB of 26 GB physical memory used; 36.1 GB of 54.6 GB virtual memory used. Killing container


Re: LLAP Start Error after increase memory parameters

Contributor

In the above log, it looks like there are two Xmx values on the command line: -Xmx22251m and -Xmx33251m. Do you know where the 2nd value comes from? Was one of them specified via args? I'm not sure which one would apply (it would be shown in the JMX view of the LLAP daemon), but the container limit is 26 GB, so if the 2nd value applies, that is the reason the container exceeds its memory limit.
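
The comparison can be made concrete with a small script. This is a rough sketch, assuming the diagnostics text has the "Current usage: X GB of Y GB physical memory used" shape quoted in this thread and that the last -Xmx on the command line is the one the JVM applies. The function names are hypothetical. Heap is only part of the process footprint (metaspace and any off-heap cache come on top), so a heap setting below the limit is necessary but not sufficient.

```python
import re


def physical_memory_gb(diagnostics):
    """Return (used_gb, limit_gb) from a YARN 'beyond physical memory limits' message."""
    match = re.search(
        r"Current usage: ([\d.]+) GB of ([\d.]+) GB physical memory used", diagnostics
    )
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))


def last_xmx_mb(command_line):
    """Return the last -Xmx value given in megabytes, or None if there is none."""
    values = re.findall(r"-Xmx(\d+)m\b", command_line)
    return int(values[-1]) if values else None


def heap_exceeds_limit(diagnostics, command_line):
    """True if the (assumed effective) max heap alone is above the container limit."""
    usage = physical_memory_gb(diagnostics)
    xmx_mb = last_xmx_mb(command_line)
    if usage is None or xmx_mb is None:
        return None
    limit_mb = usage[1] * 1024
    return xmx_mb > limit_mb


if __name__ == "__main__":
    diag = ("is running beyond physical memory limits. Current usage: 27.0 GB of "
            "26 GB physical memory used; 36.1 GB of 54.6 GB virtual memory used.")
    cmd = "java -Dproc_llapdaemon -Xms22251m -Xmx22251m -XX:+UseG1GC -Xmx33251m -server"
    print(heap_exceeds_limit(diag, cmd))
```

With the values from this thread, 33251 MB is compared against a 26 GB (26624 MB) limit.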