Member since: 05-04-2014
Posts: 6
Kudos Received: 0
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
2821 | 05-28-2014 09:46 AM |
06-07-2014
01:18 PM
Hi, I am trying to decommission an HDFS host, but the process just hangs and the node fails to decommission. I'm not sure how to go about troubleshooting this problem. Any suggestions? I've tried shutting down the service manually, but I can't figure out how to: there is nothing in /etc/init.d that seems to be related to the HDFS service. I should mention that the real problem is that even though I have YARN and HDFS running on that host, no jobs are scheduled to it. So I'm trying to reinstall everything to see if that fixes things.
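One way to see where a hanging decommission stands is to look at the per-DataNode "Decommission Status" lines that `hdfs dfsadmin -report` prints. Below is a minimal Python sketch that pulls those lines out of a saved report. The host names and addresses in the sample are made up for illustration, and the function name is hypothetical; it only parses text and does not talk to the cluster.

```python
def decommission_states(report_text):
    """Map each DataNode 'Name:' entry in a dfsadmin report to its decommission status."""
    states = {}
    current = None
    for line in report_text.splitlines():
        line = line.strip()
        if line.startswith("Name:"):
            # Everything after the first colon identifies the DataNode
            current = line.split(":", 1)[1].strip()
        elif line.startswith("Decommission Status") and current is not None:
            states[current] = line.split(":", 1)[1].strip()
    return states


# Illustrative sample only; real output has many more fields per node.
SAMPLE_REPORT = """\
Name: 10.0.0.11:50010 (worker1.example.com)
Hostname: worker1.example.com
Decommission Status : Decommission in progress

Name: 10.0.0.12:50010 (worker2.example.com)
Hostname: worker2.example.com
Decommission Status : Normal
"""

if __name__ == "__main__":
    for node, status in decommission_states(SAMPLE_REPORT).items():
        print(node, "->", status)
```

A node that stays at "Decommission in progress" for a long time is usually still waiting for its blocks to be re-replicated elsewhere, so that is the first thing I would check.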
05-28-2014
09:46 AM
Hi, I figured out what the problem was. The parcel was not being downloaded on the Cloudera Manager host because there was not enough disk space on that host. After I changed the path, the parcel downloaded successfully. It is fixed now, thanks!
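For anyone hitting the same thing: the free space that matters is on the filesystem holding the parcel download path on the Cloudera Manager host, not on the cluster nodes. A rough Python sketch of that check is below. The function names and the 2 GiB threshold are my own assumptions, and the path you pass in should be whatever parcel directory your server is configured with (I use `.` here only so the example runs anywhere).

```python
import shutil


def free_gib(path):
    """Return free space, in GiB, on the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)
    return usage.free / 2**30


def has_room(path, needed_gib):
    """True if the filesystem containing `path` has at least `needed_gib` GiB free."""
    return free_gib(path) >= needed_gib


if __name__ == "__main__":
    # Replace "." with the configured parcel directory on the Cloudera Manager host.
    # The 2 GiB threshold is an arbitrary illustration, not a documented requirement.
    parcel_dir = "."
    print(f"{free_gib(parcel_dir):.1f} GiB free under {parcel_dir!r}")
    print("enough room:", has_room(parcel_dir, 2))
```

Run it on the host that actually performs the download; running `df -h` somewhere else is what misled me.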
05-28-2014
03:08 AM
Hi, I'm trying to install CDH on my cluster through Cloudera Manager. However, while trying to download the parcels, I get an error. However, df -h shows me this:

Filesystem | Size | Used | Avail | Use% | Mounted on |
---|---|---|---|---|---|
rootfs | 19G | 3.0G | 15G | 17% | / |
udev | 10M | 0 | 10M | 0% | /dev |
tmpfs | 201M | 224K | 201M | 1% | /run |
/dev/disk/by-uuid/1a724f4b-1c04-45d5-ad66-979208a20ba1 | 19G | 3.0G | 15G | 17% | / |
tmpfs | 5.0M | 0 | 5.0M | 0% | /run/lock |
tmpfs | 578M | 0 | 578M | 0% | /run/shm |
cm_processes | 1003M | 0 | 1003M | 0% | /run/cloudera-scm-agent/process |

There seems to be ample space on the device. Any suggestions?
05-28-2014
01:43 AM
Hi bgooley, Thank you for your reply. I did as you suggested and here are the stderr and stdout from the inspection:

stderr:
+ DDL_DIR=/usr/share/cmf/schema
+ [[ inspector == \f\i\r\e\h\o\s\e ]]
+ [[ inspector == \e\v\e\n\t\s\e\r\v\e\r ]]
+ [[ inspector == \a\l\e\r\t\p\u\b\l\i\s\h\e\r ]]
+ [[ inspector == \h\e\a\d\l\a\m\p ]]
+ [[ inspector == \i\n\s\p\e\c\t\o\r ]]
+ shift
++ pwd
+ MGMT_CLASSPATH='/run/cloudera-scm-agent/process/5-host-inspector:/usr/share/java/mysql-connector-java.jar:/usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/cmf/lib/*'
+ echo_and_exec /usr/lib/jvm/java-7-oracle-cloudera/bin/java -server -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:+UseParNewGC -Dmgmt.log.file= -Djava.awt.headless=true -Djava.net.preferIPv4Stack=true -cp '/run/cloudera-scm-agent/process/5-host-inspector:/usr/share/java/mysql-connector-java.jar:/usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/cmf/lib/*' com.cloudera.cmf.inspector.Inspector input.json output.json DEFAULT
+ echo 'Executing: /usr/lib/jvm/java-7-oracle-cloudera/bin/java' -server -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:+UseParNewGC -Dmgmt.log.file= -Djava.awt.headless=true -Djava.net.preferIPv4Stack=true -cp '/run/cloudera-scm-agent/process/5-host-inspector:/usr/share/java/mysql-connector-java.jar:/usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/cmf/lib/*' com.cloudera.cmf.inspector.Inspector input.json output.json DEFAULT
+ exec /usr/lib/jvm/java-7-oracle-cloudera/bin/java -server -XX:+UseConcMarkSweepGC -XX:-CMSConcurrentMTEnabled -XX:+UseParNewGC -Dmgmt.log.file= -Djava.awt.headless=true -Djava.net.preferIPv4Stack=true -cp '/run/cloudera-scm-agent/process/5-host-inspector:/usr/share/java/mysql-connector-java.jar:/usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar:/usr/share/java/oracle-connector-java.jar:/usr/share/cmf/lib/*' com.cloudera.cmf.inspector.Inspector input.json output.json DEFAULT

stdout:
[ main] ExtantInitdInspection INFO Could not list directory /etc/init.d/rc2.d
[ main] ExtantInitdInspection WARN Skipping file README.
[ main] ExtantInitdInspection INFO Could not list directory /etc/init.d/rc3.d
[ main] ExtantInitdInspection WARN Skipping file README.
[ main] ExtantInitdInspection INFO Could not list directory /etc/init.d/rc4.d
[ main] ExtantInitdInspection WARN Skipping file README.
[ main] ExtantInitdInspection INFO Could not list directory /etc/init.d/rc5.d
[ main] Inspector INFO Running inspection: com.cloudera.cmf.inspector.TransparentHugePagesInspection@9896c52
[ main] Inspector INFO Running inspection: com.cloudera.cmf.inspector.SwappinessInspection@1d268062
{
"allHostDnsErrors" : [ ],
"allHostDnsSuccesses" : [ 1 ],
"allHostsDnsAvgDurationMillis" : 1,
"allHostsDnsCount" : 2,
"allHostsDnsMaxDurationMillis" : 2,
"etcHostsError" : null,
"etcHostsMessages" : [ ],
"etcKrbConfMessages" : [ ],
"extantInitdErrors" : [ ],
"groupData" : "cloudera-scm:x:110:\n",
"hostDnsErrors" : [ ],
"hostname" : "hadoop-master.cccs.uwe.ac.uk",
"jceStrength" : 0,
"kernelVersion" : "3.2.0-4-amd64",
"kernelVersionException" : null,
"localHostIpError" : null,
"localhostIp" : "127.0.0.1",
"nowMillis" : 1401266251008,
"rhelRelease" : null,
"runExceptions" : [ ],
"swappiness" : "60",
"swappinessException" : null,
"timeZone" : "UTC+00:00",
"transparentHugePagesDefrag" : null,
"transparentHugePagesEnabled" : null,
"transparentHugePagesException" : null,
"userData" : "cloudera-scm:x:108:110:Cloudera Manager,,,:/var/run/cloudera-scm-server:/bin/nologin\n"
}

The JSON inspection results can be found here.
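To make output like the above easier to scan, here is a small Python sketch that lists only the fields whose names end in Error, Errors, Exception, or Exceptions and that actually contain something. This naming rule is just my reading of the keys in this one output, not a documented schema, and the function name is hypothetical.

```python
import json

ERROR_SUFFIXES = ("Error", "Errors", "Exception", "Exceptions")


def inspection_problems(raw_json):
    """Return the error/exception fields of an inspector result that are not empty."""
    data = json.loads(raw_json)
    problems = {}
    for key, value in data.items():
        # Skip fields that are null, an empty list, or an empty string
        if key.endswith(ERROR_SUFFIXES) and value not in (None, [], ""):
            problems[key] = value
    return problems


# A subset of the fields from the output posted above.
SAMPLE = """
{
  "allHostDnsErrors" : [ ],
  "etcHostsError" : null,
  "extantInitdErrors" : [ ],
  "hostDnsErrors" : [ ],
  "localHostIpError" : null,
  "runExceptions" : [ ],
  "swappiness" : "60",
  "swappinessException" : null
}
"""

if __name__ == "__main__":
    found = inspection_problems(SAMPLE)
    print(found if found else "no error fields populated")
```

For the output in this post, every such field is empty or null, so the inspector itself did not report a failure.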
05-27-2014
10:31 AM
Hi all, I'm trying to install Cloudera Express 5.0.1 on my Linux machines. Currently I have a Linux host that has Cloudera Manager installed. I have two VMs running on this host, one of which I want to make a Cloudera server and the other a slave. While trying to do an automatic install, I get the following error: Installation failed. Failed to receive heartbeat from agent. The agent logs on the master-to-be show the following:

[27/May/2014 18:13:33 +0000] 12809 MainThread agent INFO Stopping agent...
[27/May/2014 18:13:33 +0000] 12809 MainThread agent INFO No extant cgroups; unmounting any cgroup roots
[27/May/2014 18:13:33 +0000] 12809 MainThread agent INFO No processes are being managed; Supervisor will shutdown.
[27/May/2014 18:13:33 +0000] 12809 MainThread agent INFO Shutting down supervisord, pid 12833
[27/May/2014 18:13:34 +0000] 12809 MonitorDaemon-Reporter __init__ INFO Couldn't get supervisord metrics: process no longer exists (pid=12833)
[27/May/2014 18:13:34 +0000] 12809 MainThread agent INFO waiting for process to terminate...
[27/May/2014 18:13:34 +0000] 12809 MainThread agent INFO Successfully killed process with pid 12833
[27/May/2014 18:13:34 +0000] 12809 MainThread _cplogging INFO [27/May/2014:18:13:34] ENGINE Bus STOPPING
[27/May/2014 18:13:34 +0000] 12809 MainThread _cplogging INFO [27/May/2014:18:13:34] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('hadoop-master.cccs.uwe.ac.uk', 9000)) shut down
[27/May/2014 18:13:34 +0000] 12809 MainThread _cplogging INFO [27/May/2014:18:13:34] ENGINE Stopped thread '_TimeoutMonitor'.
[27/May/2014 18:13:34 +0000] 12809 MainThread _cplogging INFO [27/May/2014:18:13:34] ENGINE Bus STOPPED
[27/May/2014 18:13:34 +0000] 12809 MainThread _cplogging INFO [27/May/2014:18:13:34] ENGINE Bus STOPPING
[27/May/2014 18:13:34 +0000] 12809 MainThread _cplogging INFO [27/May/2014:18:13:34] ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('hadoop-master.cccs.uwe.ac.uk', 9000)) already shut down
[27/May/2014 18:13:34 +0000] 12809 MainThread _cplogging INFO [27/May/2014:18:13:34] ENGINE No thread running for None.
[27/May/2014 18:13:34 +0000] 12809 MainThread _cplogging INFO [27/May/2014:18:13:34] ENGINE Bus STOPPED
[27/May/2014 18:13:34 +0000] 12809 MainThread _cplogging INFO [27/May/2014:18:13:34] ENGINE Bus EXITING
[27/May/2014 18:13:34 +0000] 12809 MainThread _cplogging INFO [27/May/2014:18:13:34] ENGINE Bus EXITED
[27/May/2014 18:13:34 +0000] 12809 MainThread agent INFO Agent exiting; caught signal 15
[27/May/2014 18:13:34 +0000] 13591 MainThread agent INFO No command line vars
[27/May/2014 18:13:34 +0000] 13591 MainThread agent INFO Missing database jar: /usr/share/java/mysql-connector-java.jar (normal, if you're not using this database type)
[27/May/2014 18:13:34 +0000] 13591 MainThread agent INFO Missing database jar: /usr/share/java/oracle-connector-java.jar (normal, if you're not using this database type)
[27/May/2014 18:13:34 +0000] 13591 MainThread agent INFO Found database jar: /usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar
[27/May/2014 18:13:34 +0000] 13591 MainThread agent INFO Agent starting as pid 13591 user root(0) group root(0).
[27/May/2014 18:13:34 +0000] 13591 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent
[27/May/2014 18:13:36 +0000] 13591 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/cgroups
[27/May/2014 18:13:36 +0000] 13591 MainThread cgroups INFO cgroup pseudofile /tmp/tmprPEj0w/cpu.rt_runtime_us does not exist, skipping
[27/May/2014 18:13:36 +0000] 13591 MainThread cgroups INFO Failed to mount cgroups subsystem memory to /tmp/tmp3FjlAD, rc: 32 stderr: mount: special device cm_cgroups does not exist
[27/May/2014 18:13:36 +0000] 13591 MainThread cgroups INFO Reusing /run/cloudera-scm-agent/cgroups/cpu
[27/May/2014 18:13:36 +0000] 13591 MainThread cgroups INFO Reusing /run/cloudera-scm-agent/cgroups/cpuacct
[27/May/2014 18:13:36 +0000] 13591 MainThread cgroups INFO Reusing /run/cloudera-scm-agent/cgroups/blkio
[27/May/2014 18:13:36 +0000] 13591 MainThread agent INFO Found cgroups capabilities: {'has_memory': False, 'default_memory_limit_in_bytes': -1, 'default_blkio_weight': 1000, 'writable_cgroup_dot_procs': True, 'default_cpu_rt_runtime_us': -1, 'has_cpu': True, 'default_memory_soft_limit_in_bytes': -1, 'has_cpuacct': True, 'default_cpu_shares': 1024, 'has_blkio': True}
[27/May/2014 18:13:36 +0000] 13591 MainThread agent INFO Setting up supervisord event monitor.
[27/May/2014 18:13:36 +0000] 13591 MainThread filesystem_map INFO Monitored nodev filesystem types: ['nfs', 'nfs4', 'tmpfs']
[27/May/2014 18:13:36 +0000] 13591 MainThread filesystem_map INFO Using timeout of 2.000000
[27/May/2014 18:13:36 +0000] 13591 MainThread filesystem_map INFO Using join timeout of 0.100000
[27/May/2014 18:13:36 +0000] 13591 MainThread filesystem_map INFO Using tolerance of 60.000000
[27/May/2014 18:13:36 +0000] 13591 MainThread agent INFO Using metrics_url_timeout_seconds of 30.000000
[27/May/2014 18:13:36 +0000] 13591 MainThread agent INFO Using task_metrics_timeout_seconds of 5.000000
[27/May/2014 18:13:36 +0000] 13591 MainThread agent INFO Using max_collection_wait_seconds of 10.000000
[27/May/2014 18:13:36 +0000] 13591 MainThread metrics INFO Importing tasktracker metric schema from file /usr/lib/cmf/agent/src/cmf/monitor/tasktracker/schema.json
[27/May/2014 18:13:36 +0000] 13591 MainThread dns_names INFO Using timeout of 2.000000
[27/May/2014 18:13:36 +0000] 13591 MainThread ntp_monitor INFO Using timeout of 2.000000
[27/May/2014 18:13:36 +0000] 13591 MainThread __init__ INFO Importing metric schema from file /usr/lib/cmf/agent/src/cmf/monitor/schema.json
[27/May/2014 18:13:36 +0000] 13591 MainThread agent INFO Supervised processes will add the following to their environment (in addition to the supervisor's env): {'CDH_PARQUET_HOME': '/usr/lib/parquet', 'CDH_OOZIE_HOME': '/usr/lib/oozie', 'CDH_MR2_HOME': '/usr/lib/hadoop-mapreduce', 'CDH_ZOOKEEPER_HOME': '/usr/lib/zookeeper', 'CDH_HADOOP_BIN': '/usr/bin/hadoop', 'MGMT_HOME': '/usr/share/cmf', 'CDH_IMPALA_HOME': '/usr/lib/impala', 'CLOUDERA_MYSQL_CONNECTOR_JAR': '/usr/share/java/mysql-connector-java.jar', 'CDH_YARN_HOME': '/usr/lib/hadoop-yarn', 'CMF_PACKAGE_DIR': '/usr/lib/cmf/service', 'CDH_SPARK_HOME': '/usr/lib/spark', 'PATH': '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/bin/X11', 'CDH_HDFS_HOME': '/usr/lib/hadoop-hdfs', 'CDH_SOLR_HOME': '/usr/lib/solr', 'CDH_PIG_HOME': '/usr/lib/pig', 'CDH_SQOOP2_HOME': '/usr/lib/sqoop2', 'CDH_HUE_PLUGINS_HOME': '/usr/lib/hadoop', 'CM_STATUS_CODES': u'STATUS_NONE HDFS_DFS_DIR_NOT_EMPTY HBASE_TABLE_DISABLED HBASE_TABLE_ENABLED JOBTRACKER_IN_STANDBY_MODE YARN_RM_IN_STANDBY_MODE', 'CDH_MR1_HOME': '/usr/lib/hadoop-0.20-mapreduce', 'CLOUDERA_ORACLE_CONNECTOR_JAR': '/usr/share/java/oracle-connector-java.jar', 'CDH_HUE_HOME': '/usr/lib/hue', 'CDH_CRUNCH_HOME': '/usr/lib/crunch', 'CDH_HIVE_HOME': '/usr/lib/hive', 'CDH_HTTPFS_HOME': '/usr/lib/hadoop-httpfs', 'CDH_HADOOP_HOME': '/usr/lib/hadoop', 'JSVC_HOME': '/usr/libexec/bigtop-utils', 'HIVE_DEFAULT_XML': '/etc/hive/conf.dist/hive-default.xml', 'WEBHCAT_DEFAULT_XML': '/etc/hive-webhcat/conf.dist/webhcat-default.xml', 'CLOUDERA_POSTGRESQL_JDBC_JAR': '/usr/share/cmf/lib/postgresql-9.0-801.jdbc4.jar', 'CDH_HBASE_INDEXER_HOME': '/usr/lib/hbase-solr', 'CDH_FLUME_HOME': '/usr/lib/flume-ng', 'TOMCAT_HOME': '/usr/lib/bigtop-tomcat', 'CDH_HBASE_HOME': '/usr/lib/hbase', 'CDH_SQOOP_HOME': '/usr/lib/sqoop', 'CDH_HCAT_HOME': '/usr/lib/hive-hcatalog', 'CDH_LLAMA_HOME': '/usr/lib/llama/'}
[27/May/2014 18:13:36 +0000] 13591 MainThread agent INFO To override these variables, use /etc/cloudera-scm-agent/config.ini. Environment variables for CDH locations are not used when CDH is installed from parcels.
[27/May/2014 18:13:36 +0000] 13591 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/process
[27/May/2014 18:13:36 +0000] 13591 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/supervisor
"/var/log/cloudera-scm-agent/cloudera-scm-agent.log" [readonly] 105L, 12540C
[27/May/2014 18:13:36 +0000] 13591 MainThread agent ERROR Failed to connect to previous supervisor.
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/src/cmf/agent.py", line 1236, in find_or_start_supervisor
    self.get_supervisor_process_info()
  File "/usr/lib/cmf/agent/src/cmf/agent.py", line 1423, in get_supervisor_process_info
    self.identifier = self.supervisor_client.supervisor.getIdentification()
  File "/usr/lib/python2.7/xmlrpclib.py", line 1224, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1578, in __request
    verbose=self.__verbose
  File "/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/supervisor-3.0-py2.7.egg/supervisor/xmlrpc.py", line 460, in request
    self.connection.request('POST', handler, request_body, self.headers)
  File "/usr/lib/python2.7/httplib.py", line 962, in request
    self._send_request(method, url, body, headers)
  File "/usr/lib/python2.7/httplib.py", line 996, in _send_request
    self.endheaders(body)
  File "/usr/lib/python2.7/httplib.py", line 958, in endheaders
    self._send_output(message_body)
  File "/usr/lib/python2.7/httplib.py", line 818, in _send_output
    self.send(msg)
  File "/usr/lib/python2.7/httplib.py", line 780, in send
    self.connect()
  File "/usr/lib/python2.7/httplib.py", line 761, in connect
    self.timeout, self.source_address)
  File "/usr/lib/python2.7/socket.py", line 571, in create_connection
    raise err
error: [Errno 111] Connection refused
[27/May/2014 18:13:36 +0000] 13591 MainThread tmpfs INFO Reusing mounted tmpfs at /run/cloudera-scm-agent/process
[27/May/2014 18:13:38 +0000] 13591 MainThread agent INFO Trying to connect to newly launched supervisor (Attempt 1)
[27/May/2014 18:13:38 +0000] 13591 MainThread agent INFO Successfully connected to supervisor
[27/May/2014 18:13:38 +0000] 13591 MainThread _cplogging INFO [27/May/2014:18:13:38] ENGINE Bus STARTING
[27/May/2014 18:13:38 +0000] 13591 MainThread _cplogging INFO [27/May/2014:18:13:38] ENGINE Started monitor thread '_TimeoutMonitor'.
[27/May/2014 18:13:38 +0000] 13591 MainThread _cplogging INFO [27/May/2014:18:13:38] ENGINE Serving on hadoop-master.cccs.uwe.ac.uk:9000
[27/May/2014 18:13:38 +0000] 13591 MainThread _cplogging INFO [27/May/2014:18:13:38] ENGINE Bus STARTED
[27/May/2014 18:13:38 +0000] 13591 MainThread __init__ INFO New monitor: (<cmf.monitor.host.HostMonitor object at 0x1c86950>,)
[27/May/2014 18:13:38 +0000] 13591 MainThread agent WARNING Setting default socket timeout to 30!
[27/May/2014 18:13:38 +0000] 13591 MonitorDaemon-Scheduler __init__ INFO Monitor ready to report: ('HostMonitor',)
[27/May/2014 18:13:38 +0000] 13591 MainThread agent INFO Using parcels directory from server provided value: /opt/cloudera/parcels
[27/May/2014 18:13:38 +0000] 13591 MainThread parcel INFO Agent does create users/groups and apply file permissions
[27/May/2014 18:13:38 +0000] 13591 MainThread downloader INFO Downloader path: /opt/cloudera/parcel-cache
[27/May/2014 18:13:38 +0000] 13591 MainThread parcel_cache INFO Using /opt/cloudera/parcel-cache for parcel cache
[27/May/2014 18:13:38 +0000] 13591 MainThread agent INFO Active parcel list updated; recalculating component info.
[27/May/2014 18:13:43 +0000] 13591 Monitor-HostMonitor throttling_logger INFO Using java location: '/usr/lib/jvm/java-7-oracle-cloudera/bin/java'.
[27/May/2014 18:13:43 +0000] 13591 Monitor-HostMonitor throttling_logger ERROR Failed to collect NTP metrics
Traceback (most recent call last):
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/ntp_monitor.py", line 39, in collect
    result, stdout, stderr = self._subprocess_with_timeout(args, self._timeout)
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/ntp_monitor.py", line 32, in _subprocess_with_timeout
    return subprocess_with_timeout(args, timeout)
  File "/usr/lib/cmf/agent/src/cmf/monitor/host/subprocess_timeout.py", line 40, in subprocess_with_timeout
    close_fds=True)
  File "/usr/lib/python2.7/subprocess.py", line 679, in __init__
    errread, errwrite)
  File "/usr/lib/python2.7/subprocess.py", line 1259, in _execute_child
    raise child_exception
OSError: [Errno 2] No such file or directory

I did a forum search for this topic and other people have had problems, but their log files are different from mine. Any idea what's going on here?
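Since most of a log like this is INFO noise, it can help to filter it down to the WARNING and ERROR entries before comparing with other people's logs. Here is a minimal Python sketch that does that. The regular expression is inferred only from the entries in this post (timestamp in brackets, pid, thread, module, level, message) and assumes one entry per line, as in the log file on disk; the function name is hypothetical.

```python
import re

# Pattern inferred from the entries in this post, not from any documented format.
LOG_LINE = re.compile(
    r"^\[(?P<ts>[^\]]+)\]\s+(?P<pid>\d+)\s+(?P<thread>\S+)\s+"
    r"(?P<module>\S+)\s+(?P<level>[A-Z]+)\s+(?P<msg>.*)$"
)


def entries_at_level(log_text, levels=("ERROR", "WARNING")):
    """Return (timestamp, module, message) for each entry whose level is in `levels`."""
    hits = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line)
        if match and match.group("level") in levels:
            hits.append((match.group("ts"), match.group("module"), match.group("msg")))
    return hits


# Four entries copied from the log above.
SAMPLE_LOG = """\
[27/May/2014 18:13:36 +0000] 13591 MainThread agent ERROR Failed to connect to previous supervisor.
[27/May/2014 18:13:38 +0000] 13591 MainThread agent INFO Successfully connected to supervisor
[27/May/2014 18:13:38 +0000] 13591 MainThread agent WARNING Setting default socket timeout to 30!
[27/May/2014 18:13:43 +0000] 13591 Monitor-HostMonitor throttling_logger ERROR Failed to collect NTP metrics
"""

if __name__ == "__main__":
    for ts, module, msg in entries_at_level(SAMPLE_LOG):
        print(ts, module, msg)
```

Traceback lines do not match the pattern, so they are skipped; only the entry that introduces each traceback is reported.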