Installation failed. Failed to receive heartbeat from agent

New Contributor

Hi Team,

 

    I need your help resolving the following error. I am trying to set up a 3-node cluster.

 

Error

***************************************************************************************************************************

Installation failed. Failed to receive heartbeat from agent.

  • Ensure that the host's hostname is configured properly.
  • Ensure that port 7182 is accessible on the Cloudera Manager Server (check firewall rules).
  • Ensure that ports 9000 and 9001 are not in use on the host being added.
  • Check agent logs in /var/log/cloudera-scm-agent/ on the host being added. (Some of the logs can be found in the installation details).
  • If Use TLS Encryption for Agents is enabled in Cloudera Manager (Administration -> Settings -> Security), ensure that /etc/cloudera-scm-agent/config.ini has use_tls=1 on the host being added. Restart the corresponding agent and click the Retry link here.

****************************************************************************************************************************
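
The port 7182 bullet can be checked from the host being added with a plain TCP connection attempt. A minimal sketch, assuming Python 3 is available on that host; the helper name `port_reachable` is illustrative, and the Cloudera Manager Server hostname is not given in this post, so it has to be supplied by the reader:

```python
import socket

def port_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `port_reachable("<your CM Server host>", 7182)` returning False would point toward a firewall or name-resolution problem rather than an agent problem.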

 

I tried the following, but with no luck so far.

 

1. Checked the hostname and found it to be correct

 

root@lon-ref-iot-portal-1:~# python -c 'import socket; print socket.getfqdn(),
> socket.gethostbyname(socket.getfqdn())'
lon-ref-iot-portal-1
root@lon-ref-iot-portal-1:~# hostname
lon-ref-iot-portal-1

 

2. Checked ports 9000 and 9001

 

netstat -apn | grep 9001
tcp 0 0 127.0.0.1:59285 127.0.0.1:19001 TIME_WAIT -

netstat -apn | grep 9000
root@lon-ref-iot-portal-1:~#
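
Note that `grep 9001` also matches port 19001, which is the only line the first command returned. A stricter check is to try binding the exact port. A minimal sketch, assuming Python 3; the helper name `port_in_use` is illustrative, and a failed bind is treated as "in use" even though it can also fail for other reasons (for example, missing privileges):

```python
import socket

def port_in_use(port, host="0.0.0.0"):
    """Return True if host:port cannot be bound, which usually means it is taken."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        try:
            s.bind((host, port))
        except OSError:
            return True
    return False

for p in (9000, 9001):
    print(p, port_in_use(p))
```

The agent log later in this post shows the agent itself serving on port 9000, so this check is only meaningful while the agent is stopped.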

 

3. Here is what /etc/hosts looks like

 

#127.0.0.1 localhost
127.0.0.1 localdomain localhost
#127.0.1.1 localhost.cs1cloud.internal localhost
10.50.0.153 lon-ref-iot-portal-1 portal-1
10.50.0.42 lon-ref-iot-mongo-2 mongo-2
10.50.0.197 lon-ref-iot-mongo-3 mongo-3
#10.50.0.153 kafka
# The following lines are desirable for IPv6 capable hosts
::1 ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
iff02::2 ip6-allrouters
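
As far as I understand Cloudera's networking guidance, the first name after each IP address should be the fully qualified name, followed by any short aliases. A minimal sketch (Python 3) that flags entries whose first name has no dot; the helper name and the "contains a dot" heuristic are assumptions for illustration, not an official check:

```python
def entries_without_fqdn(hosts_text):
    """Return (ip, first_name) pairs whose first name contains no dot."""
    flagged = []
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2:
            continue
        ip, first_name = fields[0], fields[1]
        if ip.startswith("127.") or ":" in ip:
            continue  # skip loopback and IPv6 entries
        if "." not in first_name:
            flagged.append((ip, first_name))
    return flagged

# The three cluster entries from the file above.
sample = """\
10.50.0.153 lon-ref-iot-portal-1 portal-1
10.50.0.42 lon-ref-iot-mongo-2 mongo-2
10.50.0.197 lon-ref-iot-mongo-3 mongo-3
"""
for ip, name in entries_without_fqdn(sample):
    print(ip, name)
```

All three cluster entries are flagged, because none of them carries a domain part.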

 

4. Here is the log file

 
agent logs: 
BEGIN tail -n 50 /var/log/cloudera-scm-agent//cloudera-scm-agent.out | sed 's/^/>>/' 
>>/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.15.0-py2.7.egg/cmf/redaction.py:260: SyntaxWarning: name 'REDACTOR' is assigned to before global declaration 
>> global REDACTOR 
>>/usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.15.0-py2.7.egg/cmf/redaction.py:261: SyntaxWarning: name 'NOP_REDACTOR' is assigned to before global declaration 
>> global NOP_REDACTOR 
>>[13/Jul/2018 07:28:57 +0000] 22163 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 07:28:57 +0000] 22163 MainThread agent WARNING Expected mode 0751 for /run/cloudera-scm-agent but was 0755 
>>[13/Jul/2018 07:28:57 +0000] 22163 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent 
>>[13/Jul/2018 08:38:56 +0000] 59290 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 08:38:56 +0000] 59290 MainThread agent WARNING Expected mode 0751 for /run/cloudera-scm-agent but was 0755 
>>[13/Jul/2018 08:38:56 +0000] 59290 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent 
>>[13/Jul/2018 08:43:39 +0000] 19447 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 08:43:39 +0000] 19447 MainThread agent WARNING Expected mode 0751 for /run/cloudera-scm-agent but was 0755 
>>[13/Jul/2018 08:43:39 +0000] 19447 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent 
>>[13/Jul/2018 08:53:07 +0000] 3653 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 08:53:07 +0000] 3653 MainThread agent WARNING Expected mode 0751 for /run/cloudera-scm-agent but was 0755 
>>[13/Jul/2018 08:53:07 +0000] 3653 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent 
>>[13/Jul/2018 09:56:59 +0000] 9489 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 09:56:59 +0000] 9489 MainThread agent WARNING Expected mode 0751 for /run/cloudera-scm-agent but was 0755 
>>[13/Jul/2018 09:56:59 +0000] 9489 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent 
>>[13/Jul/2018 10:10:26 +0000] 14756 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 10:10:26 +0000] 14756 MainThread agent WARNING Expected mode 0751 for /run/cloudera-scm-agent but was 0755 
>>[13/Jul/2018 10:10:26 +0000] 14756 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent 
>>[13/Jul/2018 10:18:57 +0000] 59097 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 10:18:57 +0000] 59097 MainThread agent WARNING Expected mode 0751 for /run/cloudera-scm-agent but was 0755 
>>[13/Jul/2018 10:18:57 +0000] 59097 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent 
>>[13/Jul/2018 10:23:55 +0000] 20019 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 10:23:55 +0000] 20019 MainThread agent WARNING Expected mode 0751 for /run/cloudera-scm-agent but was 0755 
>>[13/Jul/2018 10:23:55 +0000] 20019 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent 
>>[13/Jul/2018 10:25:19 +0000] 27896 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 10:25:19 +0000] 27896 MainThread agent WARNING Expected mode 0751 for /run/cloudera-scm-agent but was 0755 
>>[13/Jul/2018 10:25:19 +0000] 27896 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent 
>>[13/Jul/2018 10:57:42 +0000] 426 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 10:57:42 +0000] 426 MainThread agent WARNING Expected mode 0751 for /run/cloudera-scm-agent but was 0755 
>>[13/Jul/2018 10:57:42 +0000] 426 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent 
>>[13/Jul/2018 11:07:21 +0000] 51127 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 11:07:21 +0000] 51127 MainThread agent WARNING Expected mode 0751 for /run/cloudera-scm-agent but was 0755 
>>[13/Jul/2018 11:07:21 +0000] 51127 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent 
>>[13/Jul/2018 11:10:13 +0000] 701 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 11:10:13 +0000] 701 MainThread agent WARNING Expected mode 0751 for /run/cloudera-scm-agent but was 0755 
>>[13/Jul/2018 11:10:13 +0000] 701 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent 
>>[13/Jul/2018 11:10:41 +0000] 3889 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 11:10:41 +0000] 3889 MainThread agent WARNING Expected mode 0751 for /run/cloudera-scm-agent but was 0755 
>>[13/Jul/2018 11:10:41 +0000] 3889 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent 
>>[13/Jul/2018 11:18:56 +0000] 47262 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 11:18:56 +0000] 47262 MainThread agent WARNING Expected mode 0751 for /run/cloudera-scm-agent but was 0755 
>>[13/Jul/2018 11:18:56 +0000] 47262 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent 
>>[13/Jul/2018 11:21:08 +0000] 59122 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 11:21:08 +0000] 59122 MainThread agent WARNING Expected mode 0751 for /run/cloudera-scm-agent but was 0755 
>>[13/Jul/2018 11:21:08 +0000] 59122 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent 
END (0) 
BEGIN tail -n 50 /var/log/cloudera-scm-agent//cloudera-scm-agent.log | sed 's/^/>>/' 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread agent INFO Found cgroups capabilities: {'has_memory': True, 'default_memory_limit_in_bytes': -1, 'default_memory_soft_limit_in_bytes': -1, 'writable_cgroup_dot_procs': True, 'default_cpu_rt_runtime_us': -1, 'has_cpu': True, 'default_blkio_weight': 1000, 'default_cpu_shares': 1024, 'has_cpuacct': True, 'has_blkio': True} 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread agent INFO Setting up supervisord event monitor. 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread filesystem_map INFO Monitored nodev filesystem types: ['nfs', 'nfs4', 'tmpfs'] 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread filesystem_map INFO Using timeout of 2.000000 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread filesystem_map INFO Using join timeout of 0.100000 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread filesystem_map INFO Using tolerance of 60.000000 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread filesystem_map INFO Local filesystem types whitelist: ['ext2', 'ext3', 'ext4', 'xfs'] 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread kt_renewer INFO Agent wide credential cache set to /run/cloudera-scm-agent/krb5cc_cm_agent_0 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread agent INFO Using metrics_url_timeout_seconds of 30.000000 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread agent INFO Using task_metrics_timeout_seconds of 5.000000 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread agent INFO Using max_collection_wait_seconds of 10.000000 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread metrics INFO Importing tasktracker metric schema from file /usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.15.0-py2.7.egg/cmf/monitor/tasktracker/schema.json 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread ntp_monitor INFO Using timeout of 2.000000 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread dns_names INFO Using timeout of 30.000000 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread __init__ INFO Created DNS monitor. 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread stacks_collection_manager INFO Using max_uncompressed_file_size_bytes: 5242880 
>>[13/Jul/2018 11:18:56 +0000] 47316 MainThread __init__ INFO Importing metric schema from file /usr/lib/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.15.0-py2.7.egg/cmf/monitor/schema.json 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread agent INFO Supervised processes will add the following to their environment (in addition to the supervisor's env): {'CDH_PARQUET_HOME': '/usr/lib/parquet', 'JSVC_HOME': '/usr/libexec/bigtop-utils', 'CMF_PACKAGE_DIR': '/usr/lib/cmf/service', 'CDH_HADOOP_BIN': '/usr/bin/hadoop', 'MGMT_HOME': '/usr/share/cmf', 'CDH_IMPALA_HOME': '/usr/lib/impala', 'CDH_YARN_HOME': '/usr/lib/hadoop-yarn', 'CDH_HDFS_HOME': '/usr/lib/hadoop-hdfs', 'PATH': '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games', 'CDH_HUE_PLUGINS_HOME': '/usr/lib/hadoop', 'CM_STATUS_CODES': u'STATUS_NONE HDFS_DFS_DIR_NOT_EMPTY HBASE_TABLE_DISABLED HBASE_TABLE_ENABLED JOBTRACKER_IN_STANDBY_MODE YARN_RM_IN_STANDBY_MODE', 'KEYTRUSTEE_KP_HOME': '/usr/share/keytrustee-keyprovider', 'CLOUDERA_ORACLE_CONNECTOR_JAR': '/usr/share/java/oracle-connector-java.jar', 'CDH_SQOOP2_HOME': '/usr/lib/sqoop2', 'KEYTRUSTEE_SERVER_HOME': '/usr/lib/keytrustee-server', 'CDH_MR2_HOME': '/usr/lib/hadoop-mapreduce', 'HIVE_DEFAULT_XML': '/etc/hive/conf.dist/hive-default.xml', 'CLOUDERA_POSTGRESQL_JDBC_JAR': '/usr/share/cmf/lib/postgresql-42.1.4.jre7.jar', 'CDH_KMS_HOME': '/usr/lib/hadoop-kms', 'CDH_HBASE_HOME': '/usr/lib/hbase', 'CDH_SQOOP_HOME': '/usr/lib/sqoop', 'WEBHCAT_DEFAULT_XML': '/etc/hive-webhcat/conf.dist/webhcat-default.xml', 'CDH_OOZIE_HOME': '/usr/lib/oozie', 'CDH_ZOOKEEPER_HOME': '/usr/lib/zookeeper', 'CDH_HUE_HOME': '/usr/lib/hue', 'CLOUDERA_MYSQL_CONNECTOR_JAR': '/usr/share/java/mysql-connector-java.jar', 'CDH_HBASE_INDEXER_HOME': '/usr/lib/hbase-solr', 'CDH_MR1_HOME': '/usr/lib/hadoop-0.20-mapreduce', 'CDH_SOLR_HOME': '/usr/lib/solr', 'CDH_PIG_HOME': '/usr/lib/pig', 'CDH_SENTRY_HOME': '/usr/lib/sentry', 'CDH_CRUNCH_HOME': '/usr/lib/crunch', 'CDH_LLAMA_HOME': '/usr/lib/llama/', 'CDH_HTTPFS_HOME': '/usr/lib/hadoop-httpfs', 'CDH_HADOOP_HOME': '/usr/lib/hadoop', 'CDH_HIVE_HOME': '/usr/lib/hive', 'ORACLE_HOME': '/usr/share/oracle/instantclient', 'CDH_HCAT_HOME': '/usr/lib/hive-hcatalog', 'CDH_KAFKA_HOME': '/usr/lib/kafka', 'CDH_SPARK_HOME': '/usr/lib/spark', 'TOMCAT_HOME': '/usr/lib/bigtop-tomcat', 'CDH_FLUME_HOME': '/usr/lib/flume-ng'}
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread agent INFO To override these variables, use /etc/cloudera-scm-agent/config.ini. Environment variables for CDH locations are not used when CDH is installed from parcels. 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/process 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/supervisor 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/flood 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread agent INFO Re-using pre-existing directory: /run/cloudera-scm-agent/supervisor/include 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread agent INFO Supervisor version: 3.0, pid: 22682 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread agent INFO Connecting to previous supervisor: agent-22163-1531463337. 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread status_server INFO Using maximum impala profile bundle size of 1073741824 bytes. 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread status_server INFO Using maximum stacks log bundle size of 1073741824 bytes. 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread _cplogging INFO [13/Jul/2018:11:18:57] ENGINE Bus STARTING 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread _cplogging INFO [13/Jul/2018:11:18:57] ENGINE Started monitor thread '_TimeoutMonitor'. 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread _cplogging INFO [13/Jul/2018:11:18:57] ENGINE Serving on lon-ref-iot-portal-1:9000 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread _cplogging INFO [13/Jul/2018:11:18:57] ENGINE Bus STARTED 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread __init__ INFO New monitor: (<cmf.monitor.host.HostMonitor object at 0x7f6f0aec3ed0>,) 
>>[13/Jul/2018 11:18:57 +0000] 47316 MonitorDaemon-Scheduler __init__ INFO Monitor ready to report: ('HostMonitor',) 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread agent INFO Setting default socket timeout to 45 
>>[13/Jul/2018 11:18:57 +0000] 47316 Monitor-HostMonitor network_interfaces INFO NIC iface docker0 doesn't support ETHTOOL (95) 
>>[13/Jul/2018 11:18:57 +0000] 47316 MainThread heartbeat_tracker INFO HB stats (seconds): num:1 LIFE_MIN:0.02 min:0.02 mean:0.02 max:0.02 LIFE_MAX:0.02 
>>[13/Jul/2018 11:21:08 +0000] 59177 MainThread __init__ INFO Agent UUID file was last modified at 2018-07-13 07:28:57.433583 
>>[13/Jul/2018 11:21:08 +0000] 59177 MainThread agent INFO ================================================================================ 
>>[13/Jul/2018 11:21:08 +0000] 59177 MainThread agent INFO SCM Agent Version: 5.15.0 
>>[13/Jul/2018 11:21:08 +0000] 59177 MainThread agent INFO Agent Protocol Version: 4 
>>[13/Jul/2018 11:21:08 +0000] 59177 MainThread agent INFO Using Host ID: 18798eec-35c8-4581-a872-ccae3168c7f6 
>>[13/Jul/2018 11:21:08 +0000] 59177 MainThread agent INFO Using directory: /run/cloudera-scm-agent 
>>[13/Jul/2018 11:21:08 +0000] 59177 MainThread agent INFO Using supervisor binary path: /usr/lib/cmf/agent/build/env/bin/supervisord 
>>[13/Jul/2018 11:21:08 +0000] 59177 MainThread agent INFO Neither verify_cert_file nor verify_cert_dir are configured. Not performing validation of server certificates in HTTPS communication. These options can be configured in this agent's config.ini file to enable certificate validation. 
>>[13/Jul/2018 11:21:08 +0000] 59177 MainThread agent INFO Agent Logging Level: INFO 
>>[13/Jul/2018 11:21:08 +0000] 59177 MainThread agent INFO No command line vars 
>>[13/Jul/2018 11:21:08 +0000] 59177 MainThread agent INFO Found database jar: /usr/share/java/mysql-connector-java.jar 
>>[13/Jul/2018 11:21:08 +0000] 59177 MainThread agent INFO Missing database jar: /usr/share/java/oracle-connector-java.jar (normal, if you're not using this database type) 
>>[13/Jul/2018 11:21:08 +0000] 59177 MainThread agent INFO Found database jar: /usr/share/cmf/lib/postgresql-42.1.4.jre7.jar 
>>[13/Jul/2018 11:21:08 +0000] 59177 MainThread agent INFO Agent starting as pid 59177 user root(0) group root(0). 
END (0) 
end of agent logs. 
scm agent started 

Installation script completed successfully.

all done 
closing logging file descriptor 
 
 
Your help would be much appreciated.

 

 

Re: Installation failed. Failed to receive heartbeat from agent

Super Guru

@Nihar

 

Hi,

 

First, you should make sure your hosts use Fully-Qualified Domain Names (FQDNs). It appears that your hosts only have hostnames and no domains. This will cause issues down the road, as certain operations expect an FQDN.

 

https://www.cloudera.com/documentation/enterprise/5-14-x/topics/cdh_ig_networknames_configure.html

 

I don't know that it has anything to do with the installation problem, but it is best to correct it before getting your cluster established.

 

The command you want to use is:

 

python -c "import socket; print socket.getfqdn(); print socket.gethostbyname(socket.getfqdn())"

 

It should return the FQDN and IP address of the local host where you run the python command.
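
The one-liner above uses the Python 2 `print` statement. A minimal sketch of the same check for Python 3, with an added heuristic that treats a name containing a dot as fully qualified; that heuristic and the helper names are assumptions for illustration, not something taken from Cloudera's documentation:

```python
import socket

def looks_fully_qualified(name):
    """Heuristic: at least two non-empty labels, e.g. host.example.com."""
    labels = name.rstrip(".").split(".")
    return len(labels) >= 2 and all(labels)

def describe_local_host():
    """Return (fqdn, ip); ip is None if the FQDN does not resolve."""
    fqdn = socket.getfqdn()
    try:
        ip = socket.gethostbyname(fqdn)
    except OSError:
        ip = None
    return fqdn, ip

fqdn, ip = describe_local_host()
print(fqdn, ip, looks_fully_qualified(fqdn))
```

With the output shown in the question, `looks_fully_qualified("lon-ref-iot-portal-1")` is False, which matches the observation that the hosts have no domain part.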

----------------------------

Next, we need to know which hosts showed failure.  Did one or all of the agents fail due to the heartbeat not being received?

 

The agent logs seem to jump back and forth in time, but what we don't see are any heartbeat errors/exceptions.  This could mean the agent is heartbeating.

 

Check the "Hosts" tab of Cloudera Manager and see whether the hosts are heartbeating (you can open another browser tab). If the hosts show in good health, then it seems everything went OK.

 

Based on the failure in the wizard and the lack of heartbeat errors in the agent logs, it may also be that the heartbeat thread is stuck in some way on the hosts.

 

If the hosts don't show good health in Cloudera Manager, or their last heartbeat was more than 15 seconds ago, try this:

 

On one of your hosts that failed in the wizard due to the heartbeat not being received, run this command 2 times about 5 seconds apart:

 

kill -SIGQUIT `cat /var/run/cloudera-scm-agent/cloudera-scm-agent.pid`

 

This will tell the agent to dump threads to the cloudera-scm-agent.log file.

If you share that with us, we can see if the agent is hung up in some way trying to heartbeat.
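
The two-signal step can also be scripted. A minimal sketch in Python 3, assuming the PID file path from the command above and that it runs as root on the affected host; `send_thread_dump_signals` is an illustrative name, and the `kill` and `sleep` parameters exist only so the function can be tried without touching a real process:

```python
import os
import signal
import time

PID_FILE = "/var/run/cloudera-scm-agent/cloudera-scm-agent.pid"

def send_thread_dump_signals(pid_file=PID_FILE, count=2, interval=5.0,
                             kill=os.kill, sleep=time.sleep):
    """Send SIGQUIT to the PID stored in pid_file `count` times, `interval` seconds apart."""
    with open(pid_file) as f:
        pid = int(f.read().strip())
    for i in range(count):
        kill(pid, signal.SIGQUIT)
        if i < count - 1:
            sleep(interval)  # wait between dumps, not after the last one
    return pid
```

Calling `send_thread_dump_signals()` with the defaults should be equivalent to running the `kill -SIGQUIT` command twice, five seconds apart.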

 

In summary,

 

- Go through that network documentation from Cloudera and make sure you have everything working. Make sure you have FQDNs.

- Check to see if CM shows that the hosts have heartbeated within the last 15 seconds.

- If not, get some thread dumps to see if the agents may be having trouble heartbeating.  I do see that it appears at least one heartbeat made it:
[13/Jul/2018 11:18:57 +0000] 47316 MainThread heartbeat_tracker INFO HB stats (seconds): num:1 LIFE_MIN:0.02 min:0.02 mean:0.02 max:0.02 LIFE_MAX:0.02
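
To look for such lines across a longer log, the `HB stats` entries can be pulled out with a small parser. A minimal sketch in Python 3; the line format is inferred only from the single example above, and `parse_hb_stats` is an illustrative name:

```python
import re

HB_PATTERN = re.compile(r"HB stats \(seconds\): (.*)$")

def parse_hb_stats(line):
    """Return the key:value pairs of an 'HB stats' log line as floats, or None."""
    match = HB_PATTERN.search(line.strip())
    if match is None:
        return None
    stats = {}
    for pair in match.group(1).split():
        key, _, value = pair.partition(":")
        stats[key] = float(value)
    return stats

line = ("[13/Jul/2018 11:18:57 +0000] 47316 MainThread heartbeat_tracker INFO "
        "HB stats (seconds): num:1 LIFE_MIN:0.02 min:0.02 mean:0.02 max:0.02 LIFE_MAX:0.02")
print(parse_hb_stats(line))
```

Here `num` appears to be the number of heartbeats in the reporting window, so a value of 1 is consistent with at least one heartbeat having reached the server.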