Member since 06-19-2014
16 Posts
6 Kudos Received
1 Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
3935 | 09-01-2015 01:22 PM |
09-01-2015 01:22 PM
I found the container logs via the containers web UI (on the Cloudera VM it is http://quickstart.cloudera:8042/node/allContainers).

There are two containers for my application. The first just shows the logs I was looking at earlier, indicating whether the container succeeded or failed; the second has many logs with useful info (command / errors / slider-agent / status_command). They are transient, but I was able to look at them before the application terminated.

slider-agent.out just has this line in it:

No handlers could be found for logger "root"

However, slider-agent.log gave me the info I was looking for, basically the stderr / stdout from executing the Java command line, so that is very helpful.

INFO 2015-08-19 14:07:28,422 AgentToggleLogger.py:40 - Queue result: {'componentStatus': [],
'reports': [{'actionId': u'4-1',
'clusterName': u'myapp1',
'exitcode': 1,
'reportResult': True,
'role': u'MYAPP',
'roleCommand': u'START',
'serviceName': u'myapp1',
'status': 'FAILED',
'stderr': '2015-08-19 14:07:28,268 - Error while executing command ...<removed for brevity>,
'stdout': '2015-08-19 14:07:23,261 - Execute[\'/usr/java/latest/bin/java -Xmx256m -classpath ...<removed for brevity>,
'structuredOut': '{}',
'taskId': 4}]}

Locating the container logs put me on the path to solving this.
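For anyone reading a similar "Queue result" entry, here is a minimal Python sketch of pulling out just the failed reports. It assumes the entry has already been loaded into a dict shaped like the one above; the function name `failed_reports` and the placeholder stderr/stdout strings are mine, not something the Slider agent provides.

```python
def failed_reports(queue_result):
    """Return one summary dict per report whose status is 'FAILED'."""
    summaries = []
    for report in queue_result.get("reports", []):
        if report.get("status") != "FAILED":
            continue
        summaries.append({
            "role": report.get("role"),
            "command": report.get("roleCommand"),
            "exitcode": report.get("exitcode"),
            "stderr": report.get("stderr", ""),
        })
    return summaries


# Shaped like the log entry above. The stderr/stdout values are placeholders,
# because the real messages were removed for brevity in the post.
sample = {
    "componentStatus": [],
    "reports": [{
        "actionId": "4-1",
        "clusterName": "myapp1",
        "exitcode": 1,
        "reportResult": True,
        "role": "MYAPP",
        "roleCommand": "START",
        "serviceName": "myapp1",
        "status": "FAILED",
        "stderr": "<placeholder>",
        "stdout": "<placeholder>",
        "structuredOut": "{}",
        "taskId": 4,
    }],
}

for item in failed_reports(sample):
    print(item["role"], item["command"], "exit code", item["exitcode"])
```

The stderr field is the one worth reading first, since that is where the output of the failing Java command line ended up in my case.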
08-19-2015 11:13 AM
The only error I see in the log is this:

Role instance RoleInstance failed

2015-08-19 10:59:21,819 [AMRM Callback Handler Thread] ERROR appmaster.SliderAppMaster - Role instance RoleInstance{role='MYAPP', id='container_1439926335194_0002_01_000003', container=ContainerID=container_1439926335194_0002_01_000003 nodeID=quickstart.cloudera:8041 http=quickstart.cloudera:8042 priority=1073741825 resource=<memory:1024, vCores:1>, createTime=1440007115649, startTime=1440007115674, released=false, roleId=1, host=quickstart.cloudera, hostURL=http://quickstart.cloudera:8042, state=5, placement=null, exitCode=0, command='python ./infra/agent/slider-agent/agent/main.py --label container_1439926335194_0002_01_000003___MYAPP --zk-quorum localhost:2181 --zk-reg-path /registry/users/myuser/services/org-apache-slider/myapp1> /slider-agent.out 2>&1 ; ', diagnostics='', output=null, environment=[LANGUAGE="en_US.UTF-8", AGENT_WORK_ROOT="$PWD", HADOOP_USER_NAME="C4", AGENT_LOG_ROOT="", PYTHONPATH="./infra/agent/slider-agent/", LC_ALL="en_US.UTF-8", SLIDER_PASSPHRASE="<redacted>", LANG="en_US.UTF-8"]} failed
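Since the container ID is the main handle in that log line, a small Python sketch for deriving the owning application ID from it may help when matching containers to applications in the web UI. It assumes the ID layout seen in this log, container_<clusterTimestamp>_<appNumber>_<attempt>_<containerNumber>; the helper name `application_id` is hypothetical, and other ID layouts are deliberately rejected rather than guessed at.

```python
import re

# Assumed layout, matching the IDs in the log above:
# container_<clusterTimestamp>_<appNumber>_<attempt>_<containerNumber>
CONTAINER_ID = re.compile(r"container_(\d+)_(\d+)_(\d+)_(\d+)")


def application_id(container_id):
    """Derive the application ID from a container ID of the assumed layout."""
    match = CONTAINER_ID.fullmatch(container_id)
    if match is None:
        raise ValueError("unrecognised container ID: %r" % container_id)
    cluster_timestamp, app_number, _attempt, _container = match.groups()
    return "application_%s_%s" % (cluster_timestamp, app_number)


print(application_id("container_1439926335194_0002_01_000003"))
```

If log aggregation happens to be enabled on the cluster, the resulting ID is also what `yarn logs -applicationId` expects; I have not confirmed whether that is the case on this VM.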
08-18-2015 01:51 PM
I went through the Slider Memcached Tutorial and was able to package/deploy/start the memcached container successfully. However, when I package up a custom application, basically a Java jar plus dependencies, the container never launches successfully. The application page shows the app in a FINISHED/FAILED state with this diagnostic:

http://quickstart.cloudera:8088/cluster/app/application_1439926335194_0001

Diagnostics: Unstable Application Instance : - failed with component MYAPP failed 'recently' 6 times (4 in startup); threshold is 5 - last failure: Failure container_1439926335194_0001_01_000008 on host quickstart.cloudera (0): http://quickstart.cloudera:19888/jobhistory/logs//quickstart.cloudera:8041/container_1439926335194_0001_01_000008/ctx/C4

Part of the challenge in diagnosing the issue with the container is that the logs disappear after the application completes.

http://quickstart.cloudera:8042/node/containerlogs/container_1439926335194_0001_01_000001/MYUSER

There is a troubleshooting page for Slider which indicates that you can persist the logs beyond application completion:

http://slider.incubator.apache.org/docs/troubleshooting.html

Configuring YARN for better debugging

One configuration to aid debugging is tell the nodemanagers to keep data for a short period after containers finish

<!-- 10 minutes after a failure to see what is left in the directory-->
<property>
<name>yarn.nodemanager.delete.debug-delay-sec</name>
<value>600</value>
</property>

I found this setting under YARN - Configuration - NodeManager Base Group - Advanced - Localized Dir Deletion Delay and changed it from the default of 0 to 1200. However, even after I deploy the client config and restart the NodeManager and YARN, and even restart the VM, the logs are still deleted on container completion.

I'm working on the CDH 5.3.0 VirtualBox VM image, and the cluster and services appear to be working normally as I start up the package.
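One way to check whether the changed value actually reached a given yarn-site.xml is to read the property back out of the file. The following is a minimal Python sketch under that assumption; `read_property` is a hypothetical helper, and which yarn-site.xml the NodeManager really reads on this VM is something I have not verified.

```python
import xml.etree.ElementTree as ET


def read_property(xml_text, name):
    """Return the <value> of the Hadoop-style <property> called name, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name", default="").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Same property as in the troubleshooting page, wrapped in a <configuration>
# root element so that it parses as a complete document.
sample = """<configuration>
  <!-- 10 minutes after a failure to see what is left in the directory-->
  <property>
    <name>yarn.nodemanager.delete.debug-delay-sec</name>
    <value>600</value>
  </property>
</configuration>"""

print(read_property(sample, "yarn.nodemanager.delete.debug-delay-sec"))
```

To use it on a real file, pass in the file's contents, for example `read_property(open(path).read(), "yarn.nodemanager.delete.debug-delay-sec")`, where `path` is whichever yarn-site.xml you want to inspect. A result of None or 0 would suggest the change did not land in that file.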
07-14-2015 11:25 AM
1 Kudo
I'm on the 5.3.0 VM now and was able to locate the jar in the folder you identified, thanks: /usr/lib/hadoop-mapreduce/
07-17-2014 01:03 PM
I wanted to try out some Python on Hadoop in my CDH5 VM, but I need the streaming jar file. According to the documentation here:

http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Installation-Guide/c...

the jar should be at this location:

/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh<version>.jar

or maybe here:

/usr/lib/hadoop-mapreduce/hadoop-streaming.jar

but I don't see it at either place. Is there a way to install additional components if they are missing?
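A quick way to rule out a slightly different file name or subfolder is to search both directories for anything that looks like the streaming jar. This is a minimal Python sketch; the helper name `find_streaming_jars` and the `hadoop-streaming*.jar` pattern are my own assumptions, and the two roots are simply the parent folders of the paths quoted above.

```python
import fnmatch
import os


def find_streaming_jars(roots, pattern="hadoop-streaming*.jar"):
    """Walk each root directory and return sorted paths of files matching pattern."""
    found = []
    for root in roots:
        # os.walk yields nothing for a directory that does not exist.
        for dirpath, _dirnames, filenames in os.walk(root):
            for filename in fnmatch.filter(filenames, pattern):
                found.append(os.path.join(dirpath, filename))
    return sorted(found)


if __name__ == "__main__":
    candidates = ["/usr/lib/hadoop-0.20-mapreduce", "/usr/lib/hadoop-mapreduce"]
    for path in find_streaming_jars(candidates):
        print(path)
```

An empty result for both roots would point to the jar really being absent rather than just renamed.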