Member since: 06-19-2014
Posts: 16
Kudos Received: 6
Solution: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
2131 | 09-01-2015 01:22 PM |
09-25-2015
11:14 AM
Looking at the status, I see that the process is in a sleeping state; whatever diagnostic Cloudera Manager uses to determine whether Navigator is working reports it as functional even though it is unresponsive.
[cloudera@quickstart ~]$ sudo netstat -tulpn | grep 7187
[cloudera@quickstart ~]$ sudo cat /proc/9994/status
Name: java
State: S (sleeping)
09-24-2015
07:28 PM
Looking through the log further, I see the duplicate key error on days before Navigator stopped working, so this may not be the cause.
09-23-2015
04:10 PM
I used to be able to run a metadata query through the Navigator web page, but now I just get the spinner.
/var/log/cloudera-scm-navigator mgmt-cmf-mgmt-NAVIGATOR-quickstart.cloudera.log.out
Looking through the log I find this error:
2015-09-23 14:58:04,267 ERROR com.cloudera.navigator.NavigatorEntityManager: Error running command for table NAVMS_AUDIT_EVENTS
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Duplicate key name 'IDX_DISALLOWED_NAVMS_AUDITS'
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:377)
at com.mysql.jdbc.Util.getInstance(Util.java:360)
at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:978)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3887)
at com.mysql.jdbc.MysqlIO.checkErrorPacket(MysqlIO.java:3823)
and other similar errors:
2015-09-23 14:47:14,448 ERROR com.cloudera.navigator.NavigatorEntityManager: Error running command for table SOLR_AUDIT_EVENTS
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Duplicate key name 'IDX_DISALLOWED_SOLR_AUDITS'
2015-09-23 14:58:04,294 ERROR com.cloudera.navigator.NavigatorEntityManager: Error running command for table SENTRY_AUDIT_EVENTS
com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Duplicate key name 'IDX_DISALLOWED_SENTRY_AUDITS'
- Tags:
- navigator
09-01-2015
01:22 PM
I found the container logs via the containers web UI (on the Cloudera VM it is http://quickstart.cloudera:8042/node/allContainers ). There are two containers for my application. The first just shows the logs I was looking at earlier, indicating whether the container succeeded or failed; the second has many logs with useful info (command / errors / slider-agent / status_command). They are transient, but I was able to look at them before the application terminated.
slider-agent.out just has this line in it:
No handlers could be found for logger "root"
However, slider-agent.log gave me the info I was looking for: the stderr / stdout from executing the Java command line, which is very helpful.
INFO 2015-08-19 14:07:28,422 AgentToggleLogger.py:40 - Queue result: {'componentStatus': [],
'reports': [{'actionId': u'4-1',
'clusterName': u'myapp1',
'exitcode': 1,
'reportResult': True,
'role': u'MYAPP',
'roleCommand': u'START',
'serviceName': u'myapp1',
'status': 'FAILED',
'stderr': '2015-08-19 14:07:28,268 - Error while executing command ...<removed for brevity>,
'stdout': '2015-08-19 14:07:23,261 - Execute[\'/usr/java/latest/bin/java -Xmx256m -classpath ...<removed for brevity>,
'structuredOut': '{}',
'taskId': 4}]}
Locating the container logs put me on the path to solving this.
09-01-2015
12:18 PM
If I'm in Navigator, this gives me all entities that have been tagged with "tag2":
http://localhost:7187/?view=resultsView&facets=%7B%22tags%22%3A%5B%22tag2%22%5D%7D
If I use the RESTful API, this query gets me all the tags that were used, but with a count of 0 (see below) because no entities were returned:
http://localhost:7187/api/v1/interactive/entities?facetFields=tags
How do I query the RESTful API to get back all entities with tags? I would also like to restrict the query by last modified, but that filter doesn't seem to be working. I see the API doc page here, but examples are limited:
http://cloudera.github.io/navigator/apidocs/v1/
{
"offset": 0,
"totalMatched": 0,
"limit": 100,
"results": [],
"highlighting": null,
"facets": {
"tags": {
"sample": 0,
"sometag": 0,
"tag1": 0,
"tag2": 0,
"tag3": 0
}
},
"facetQueries": null,
"facetRanges": [],
"qtime": 1
}
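For reference, the `facets` value in the web UI URL above is just URL-encoded JSON. The following is a minimal Python sketch that decodes and re-encodes it; the helper names `decode_facets` and `encode_facets` are hypothetical, and nothing here confirms that the REST endpoint accepts the same parameter as the UI.

```python
import json
from urllib.parse import quote, unquote


def decode_facets(encoded: str) -> dict:
    """Turn the percent-encoded 'facets' query value back into a dict."""
    return json.loads(unquote(encoded))


def encode_facets(facets: dict) -> str:
    """Serialize a facets dict as compact JSON and percent-encode it."""
    compact = json.dumps(facets, separators=(",", ":"))
    return quote(compact, safe="")


# Value copied from the Navigator UI URL in the post.
ui_value = "%7B%22tags%22%3A%5B%22tag2%22%5D%7D"
print(decode_facets(ui_value))
```

Decoding the value makes it easier to see which filter the UI is applying before trying to reproduce it against the API.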
08-19-2015
11:13 AM
The only error I see in the log is this: Role instance RoleInstance failed
2015-08-19 10:59:21,819 [AMRM Callback Handler Thread] ERROR appmaster.SliderAppMaster - Role instance RoleInstance{role='MYAPP', id='container_1439926335194_0002_01_000003', container=ContainerID=container_1439926335194_0002_01_000003 nodeID=quickstart.cloudera:8041 http=quickstart.cloudera:8042 priority=1073741825 resource=<memory:1024, vCores:1>, createTime=1440007115649, startTime=1440007115674, released=false, roleId=1, host=quickstart.cloudera, hostURL=http://quickstart.cloudera:8042, state=5, placement=null, exitCode=0, command='python ./infra/agent/slider-agent/agent/main.py --label container_1439926335194_0002_01_000003___MYAPP --zk-quorum localhost:2181 --zk-reg-path /registry/users/myuser/services/org-apache-slider/myapp1> /slider-agent.out 2>&1 ; ', diagnostics='', output=null, environment=[LANGUAGE="en_US.UTF-8", AGENT_WORK_ROOT="$PWD", HADOOP_USER_NAME="C4", AGENT_LOG_ROOT="", PYTHONPATH="./infra/agent/slider-agent/", LC_ALL="en_US.UTF-8", SLIDER_PASSPHRASE="<redacted>", LANG="en_US.UTF-8"]} failed
- Tags:
- roleinstance
08-18-2015
01:59 PM
This may have been because HDFS was corrupted. To check whether your HDFS is healthy, run this as the hdfs user:
hdfs fsck /
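If you want to script that check, here is a rough Python sketch. It assumes the `hdfs` command is on the PATH, that the script runs as the hdfs user, and that the fsck summary contains the phrase "is HEALTHY" when the filesystem is fine; the function names are hypothetical.

```python
import subprocess


def fsck_reports_healthy(fsck_output: str) -> bool:
    """Return True if the fsck summary text reports a healthy filesystem.

    Assumption: the summary line contains the phrase 'is HEALTHY'.
    """
    return "is HEALTHY" in fsck_output


def run_fsck(path: str = "/") -> str:
    """Run 'hdfs fsck <path>' and return its standard output."""
    result = subprocess.run(
        ["hdfs", "fsck", path],
        capture_output=True,
        text=True,
        check=False,  # inspect the output ourselves instead of raising
    )
    return result.stdout


# Example usage on a cluster node:
#   print(fsck_reports_healthy(run_fsck("/")))
```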
08-18-2015
01:51 PM
I went through the Slider Memcached Tutorial and was able to package/deploy/start the memcached container successfully; however, when I package up a custom application (basically a Java jar plus dependencies), the container never launches successfully. The application page shows the app in a FINISHED/FAILED state with this diagnostic:
http://quickstart.cloudera:8088/cluster/app/application_1439926335194_0001
Diagnostics: Unstable Application Instance : - failed with component MYAPP failed 'recently' 6 times (4 in startup); threshold is 5 - last failure: Failure container_1439926335194_0001_01_000008 on host quickstart.cloudera (0): http://quickstart.cloudera:19888/jobhistory/logs//quickstart.cloudera:8041/container_1439926335194_0001_01_000008/ctx/C4
Part of the challenge in diagnosing the issue with the container is that the logs disappear after the application completes.
http://quickstart.cloudera:8042/node/containerlogs/container_1439926335194_0001_01_000001/MYUSER
There is a troubleshooting page for Slider which indicates that you can persist the logs beyond application completion:
http://slider.incubator.apache.org/docs/troubleshooting.html
Configuring YARN for better debugging
One configuration to aid debugging is tell the nodemanagers to keep data for a short period after containers finish
<!-- 10 minutes after a failure to see what is left in the directory-->
<property>
<name>yarn.nodemanager.delete.debug-delay-sec</name>
<value>600</value>
</property>
I found this setting in Yarn - Configuration - NodeManager Base Group - Advanced - Localized Dir Deletion Delay and changed it from the default of 0 to 1200; however, even after I deploy the client config and restart Nodemanager + Yarn, and even restart the VM, the logs are still getting deleted on container completion. I'm working on the CDH 5.3.0 VirtualBox VM image, and the cluster and services appear to be working normally as I start up the package.
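One way to confirm whether the changed value actually reached a configuration file is to read the property back out of the XML. This is a minimal Python sketch; `get_property` is a hypothetical helper, the sample XML is illustrative, and which yarn-site.xml the NodeManager really reads on this VM is an assumption you would need to verify.

```python
import xml.etree.ElementTree as ET


def get_property(xml_text, name):
    """Return the value of a Hadoop-style <property> by name, or None."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        key = (prop.findtext("name") or "").strip()
        if key == name:
            return (prop.findtext("value") or "").strip()
    return None


# Illustrative sample mirroring the snippet from the Slider docs.
SAMPLE = """<configuration>
  <property>
    <name> yarn.nodemanager.delete.debug-delay-sec </name>
    <value> 600 </value>
  </property>
</configuration>"""

print(get_property(SAMPLE, "yarn.nodemanager.delete.debug-delay-sec"))
```

To check a real file, pass the contents of the yarn-site.xml in use (its path is not given in the post) instead of `SAMPLE`.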
08-10-2015
02:31 PM
Using the 5.3.0 VirtualBox VM; trying to start the cluster. It used to work but no longer does; now I get this error message:
Command 'Start' failed for cluster 'Cloudera QuickStart'
Command 'ZkStartPreservingDatastore' failed for service 'zookeeper'
ZooKeeper appears to be running:
[cloudera@quickstart ~]$ sudo service zookeeper-server start
JMX enabled by default
Using config: /etc/zookeeper/conf/zoo.cfg
Starting zookeeper ... already running as process 23023.
I'm not seeing any obvious exceptions in the ZooKeeper logs. I then tried starting HBase, and that failed with this error:
Service did not start successfully; not all of the required roles started: Service has only 0 Master roles running instead of minimum required 1.
07-14-2015
11:25 AM
1 Kudo
I'm on the 5.3.0 VM now and was able to locate the jar in the folder you identified, thanks:
/usr/lib/hadoop-mapreduce/
07-14-2015
11:00 AM
I built the distributed shell example and am using it to run a jar file that does some processing and writes several .json files out to a results folder. I'm passing the path to the script (run.sh) in the shell_script parameter. The script starts out with:
/usr/java/latest/bin/java -Xmx2G -classpath /<path>/MyMain.jar:<more jars>
The script had been using relative references to the main jar and all the dependencies, but it wasn't finding them, so I made the paths absolute. If I run the script from the command line in the folder where it resides, it creates the results folder there. If I use a container to execute it, I see through the logs (from MyMain.jar) that the results folder is being created in appcache, but the folder is gone when processing completes; something like this:
/yarn/nm/usercache/cloudera/appcache/<applicationId>/<containerId>/results
I want to be able to run multiple containers in parallel, all sharing the same main jar and dependencies, but each processing a different set of input files and writing to a separate results folder.
Is there a way to run the container in the context of the folder where the script is, so I can keep relative paths instead of using absolute paths? Do I hook the container-complete event and pull the results folder out of the cache there, or is there a better way that is already built into the infrastructure?
Distributed Shell Example source on GitHub (Crosspost to StackOverflow)
05-05-2015
04:05 PM
5 Kudos
Running the CDH 5.3.0 QuickStart VM; I submitted an MR job and it was just sitting in YARN with no indication of progress.
http://quickstart.cloudera:8088/cluster/
Cloudera Manager shows lots of systems with log out-of-space errors, so I looked through the filesystem for where I could free up space and found what appears to be excessive usage in the YARN usercache folder (6 GB in use). The file location is:
/yarn/nm/usercache/[user]/filecache
Can I just delete the cache manually, or is there a better way? Is there a way to configure YARN to use less space?
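To measure how much space a cache directory is using before deciding what to do, a small Python sketch like the following can help. `dir_size_bytes` is a hypothetical helper; it only reads file sizes and deletes nothing.

```python
import os


def dir_size_bytes(root):
    """Sum the sizes of regular files under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for fname in filenames:
            fpath = os.path.join(dirpath, fname)
            if os.path.islink(fpath):
                continue  # do not count link targets
            try:
                total += os.path.getsize(fpath)
            except OSError:
                pass  # file vanished or is unreadable; ignore it
    return total


# Example usage (path pattern from the post, with a real user name filled in):
#   print(dir_size_bytes("/yarn/nm/usercache/cloudera/filecache") / 1024**3, "GiB")
```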
02-09-2015
02:36 PM
Yes I've looked there as my question indicated. Why wouldn't I see job history?
02-06-2015
03:36 PM
Looking for a REST API to get job history. I don't see any jobs listed in the web UI even though I've had lots of activity:
http://quickstart.cloudera:19888/jobhistory
Logged in as: dr.who
Do I need to authenticate? I navigated around and found a REST endpoint for cluster info, and found the default node under it, but no applications running.
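For context, the MapReduce History Server exposes a REST resource for finished jobs at `/ws/v1/history/mapreduce/jobs` on the same port as the web UI. Below is a minimal Python sketch of calling it. The helper names are hypothetical, the host is the one from the post, and it assumes no authentication is required, which is exactly the open question here.

```python
import json
from urllib.request import Request, urlopen


def history_jobs_url(host, port=19888):
    """Build the History Server REST URL that lists finished MR jobs."""
    return f"http://{host}:{port}/ws/v1/history/mapreduce/jobs"


def job_ids(payload):
    """Extract job ids from a response shaped like {"jobs": {"job": [...]}}."""
    jobs = (payload.get("jobs") or {}).get("job") or []
    return [job["id"] for job in jobs]


def fetch_jobs(host):
    """Fetch the job list as JSON (assumes an unsecured cluster)."""
    req = Request(history_jobs_url(host), headers={"Accept": "application/json"})
    with urlopen(req, timeout=10) as resp:
        return json.load(resp)


# Example usage on the QuickStart VM:
#   print(job_ids(fetch_jobs("quickstart.cloudera")))
```

An empty list from this call would match what the web UI shows, which would point at the jobs not reaching the history server rather than at the UI.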
07-17-2014
01:03 PM
I wanted to try out some Python on Hadoop in my CDH5 VM, but I need the streaming jar file. According to the documentation here
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CDH5/latest/CDH5-Installation-Guide/c...
the jar should be at this location:
/usr/lib/hadoop-0.20-mapreduce/contrib/streaming/hadoop-streaming-2.0.0-mr1-cdh<version>.jar
or maybe here:
/usr/lib/hadoop-mapreduce/hadoop-streaming.jar
but I don't see it at either place. Is there a way to install additional components if they are missing?
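Since the exact file name depends on the CDH version, searching for the jar by pattern can be quicker than checking fixed paths. This is a small Python sketch; `find_streaming_jars` is a hypothetical helper and the `/usr/lib` search root is an assumption based on the two locations above.

```python
from pathlib import Path


def find_streaming_jars(root):
    """Return sorted paths of files matching hadoop-streaming*.jar under root."""
    return sorted(str(p) for p in Path(root).rglob("hadoop-streaming*.jar"))


# Example usage on the VM:
#   for jar in find_streaming_jars("/usr/lib"):
#       print(jar)
```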
06-19-2014
12:59 PM
CM had been working but is now failing to start; it appears to start, but when I check the status it shows as dead. The CM Web Admin page shows the Loading message spinning with no resolution.
Starting via the command line appears to work:
sudo service cloudera-scm-server start
Starting cloudera-scm-server: [ OK ]
But when I check the status, the server is dead:
[cloudera@localhost ~]$ sudo service --status-all | grep cloudera
cloudera-scm-agent (pid 9451) is running...
cloudera-scm-server dead but pid file exists
The stack trace in the log (/var/log/cloudera-scm-server/cloudera-scm-server.log) shows:
2014-06-19 10:13:41,129 WARN [EventStorePublisherWithRetry-0:publish.EventStorePublisherWithRetry@195] Failed to publish event: SimpleEvent{attributes={CLUSTER_ID=[1], SEVERITY=[INFORMATIONAL], SERVICE=[ClouderaManager], COMMAND_ID=[789], COMMAND_RESULT=[Command 'Stop' failed for cluster 'Cloudera QuickStart - C5'], ALERT=[false], USER=[clouderaManager], MESSAGE_CODES=[COMMAND_FAILED_WITH_TARGET], CATEGORY=[AUDIT_EVENT], CLUSTER=[Cloudera QuickStart - C5], COMMAND_STATUS=[FAILED], COMMAND_ARGS=[RestartClusterCmdArgs{restartOnlyStaleServices=false, redeployClientConfig=false}], SERVICE_TYPE=[ManagerServer], COMMAND=[Restart], EVENTCODE=[EV_CLUSTER_RESTARTED]}, content=Command Restart on subject Cloudera QuickStart - C5 failed., timestamp=1403196216196} - 1 of 338 failure(s) in last 1804s
java.io.IOException: Error connecting to unknown0800270ffe88:7184
at org.apache.avro.ipc.NettyTransceiver.getChannel(NettyTransceiver.java:249)
at org.apache.avro.ipc.NettyTransceiver.<init>(NettyTransceiver.java:198)
So it looks like it is attempting to connect to port 7184; this page shows that to be the Cloudera Manager Event Server:
http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM5/latest/Cloudera-Manager-Installation-Guide/cm5ig_ports_cm.html
Checking the port, I see nothing is running on it:
sudo netstat -tulpn | grep 7184
Not sure if it is related, but I installed Mongo on the box: 2.4 previously (while CM was working), and I recently upgraded to 2.6.2, after which CM stopped working.
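As a complement to the netstat check, a TCP connection attempt shows whether anything accepts connections on the port. This is a minimal Python sketch; `port_is_open` is a hypothetical helper, and a connect test is not the same as netstat's listing of listening sockets, so treat it as a rough cross-check.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage for the Event Server port mentioned in the post:
#   print(port_is_open("localhost", 7184))
```

Running it against the hostname from the stack trace (unknown0800270ffe88) as well as localhost would also show whether that name resolves at all.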