
This host is not in contact with the Host Monitor.

Rising Star

Hi all!

It's now time to replace our Cloudera 4 cluster with Cloudera 5!

I started with the first step: installing the new hardware. Done!

Now I'm starting the application installation/configuration. FYI, I will install Cloudera 5.7.1.

I have one machine for the Cloudera Manager and one for a MySQL server hosting the "monitor/metastore" databases.

 

I installed the Manager with the MySQL information for the "scm" database.

Everything went fine.

Then I added the Manager machine to a new cluster and configured the Host Monitor to use the "monitor" database.

It works: all the Cloudera Management Services look fine. I think!

 

Now I would like to add 4 new nodes to the cluster.

Two will serve as HA NameNodes, and the two others as DataNodes. Six more nodes will join the cluster later.

 

So in Cloudera Manager I launched the wizard to add the new hosts, answered all the questions, and the installation started. Everything went fine and the installation completed successfully.

 

But after that, it has to run the host inspector on all hosts, and here I hit this warning:

newnode.domain.ltd: Command aborted because of exception: Command timed-out after 150 seconds

4 hosts are reporting with NONE CDH version

There are mismatched versions across the system, which will cause failures. See below for details on which hosts are running what versions of components.

I use a local reposync mirror that only syncs the 5.7.1 version, so there should be no version mismatch: 5.7.1 is installed everywhere.
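To double-check, a quick script like this (my own sketch; the package globs are simply the obvious CDH 5 RPM names, so adjust as needed) lists what each node actually has installed:

import subprocess

# Print the installed versions of the core CDH packages on this node.
# rpm -qa accepts glob patterns; "hadoop*" covers the HDFS/YARN packages.
for pattern in ["hadoop*", "cloudera-manager-agent"]:
    out = subprocess.check_output(["rpm", "-qa", pattern])
    print(out.decode("utf-8") if isinstance(out, bytes) else out)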

 

Then, if I go to the Manager interface, I can see all the hosts in the "Hosts" section, but all of them are in red status.

 

If I select one, I see this message:

This host is in contact with the Cloudera Manager Server. This host is not in contact with the Host Monitor.

So the Cloudera Manager agent seems to be OK, but apparently it can't contact the Host Monitor...

Yet the Host Monitor is installed and configured on that same server. So...

 

And the status of the Host Monitor is green, and it can contact my remote MySQL database.

So I don't know why I get this error.

 

There is no firewall between my machines, and no SELinux.

 

I don't know why they can contact the Cloudera Manager but not the Host Monitor.
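As a sanity check (my own quick test, not a Cloudera tool), this verifies plain TCP reachability of the Host Monitor port from a new node; the host and port 9995 come from the heartbeat response shown further down:

import socket

# The HOSTMONITOR entry in the heartbeat below reports port 9995 on the
# Cloudera Manager host; adjust HOST to your own CM hostname.
HOST, PORT = "clouderamanager.domain.ltd", 9995

try:
    s = socket.create_connection((HOST, PORT), timeout=5)
    s.close()
    print("TCP connection to %s:%d OK" % (HOST, PORT))
except (socket.error, socket.timeout) as e:
    print("cannot reach %s:%d -> %s" % (HOST, PORT, e))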

 

The /etc/cloudera-scm-agent/config.ini file is OK, with the right IP and port:

server_host=10.x.x.x
server_port=7182

On the Cloudera Manager host, cloudera-scm-server and cloudera-scm-agent run nicely.

 

On my new nodes, cloudera-scm-agent runs nicely too, but I get the error below in its log.

I'm not sure this is the cause, and if it is, I don't know how to solve it.

 

[03/Aug/2016 20:56:32 +0000] 158533 MainThread agent        INFO     Flood daemon (re)start attempt
[03/Aug/2016 20:56:32 +0000] 158533 MainThread agent        ERROR    Failed to handle Heartbeat Response: {u'firehoses': [{u'roletype': u'ACTIVITYMONITOR', u'rolename': u'mgmt-ACTIVITYMONITOR-728c1b31088c1d8ddc2547d70b884cf7', u'port': 9999, u'report_interval': 60, u'address': u'clouderamanager.domain.ltd'}, {u'roletype': u'SERVICEMONITOR', u'rolename': u'mgmt-SERVICEMONITOR-728c1b31088c1d8ddc2547d70b884cf7', u'port': 9997, u'report_interval': 60, u'address': u'clouderamanager.domain.ltd'}, {u'roletype': u'HOSTMONITOR', u'rolename': u'mgmt-HOSTMONITOR-728c1b31088c1d8ddc2547d70b884cf7', u'port': 9995, u'report_interval': 60, u'address': u'clouderamanager.domain.ltd'}], u'rm_enabled': False, u'client_configs': [], u'create_parcel_symlinks': True, u'server_managed_parcels': [], u'extra_configs': None, u'host_collection_config_data': [{u'config_name': u'host_network_interface_collection_filter', u'config_value': u'^lo$'}, {u'config_name': u'host_disk_collection_filter', u'config_value': u'^$'}, {u'config_name': u'host_fs_collection_filter', u'config_value': u'^$'}, {u'config_name': u'host_log_tailing_config', u'config_value': u'{}\n'}, {u'config_name': u'host_dns_resolution_duration_thresholds', u'config_value': u'{"critical":"never","warning":"1000.0"}'}, {u'config_name': u'host_dns_resolution_enabled', u'config_value': u'true'}, {u'config_name': u'host_clock_offset_thresholds', u'config_value': u'{"critical":"10000.0","warning":"3000.0"}'}], u'apply_parcel_users_groups_permissions': True, u'flood_torrent_port': 7191, u'log_tailing_config': u'{}\n', u'active_parcels': {}, u'flood_rack_peers': [u'10.2.0.33:7191', u'10.2.0.31:7191', u'10.2.0.34:7191', u'10.2.0.29:7191', u'10.2.0.30:7191'], u'retain_parcels_in_cache': True, u'processes': [{u'status_links': {}, u'name': u'cluster-host-inspector', u'config_generation': 0, u'configuration_data': 'PK\x03\x04\x14\x00\x08\x08\x08\x00\x83\x90\x03I\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\n\x00\x00\x00input.json\xb5\xd2\xbb\x0e\xc2 \x18\x05\xe0\xbdOA\x98[\x02\xbd\x98\xe8\xd6\xe8\xd0\xc5\xd4\xb8\x1a\x07\x14\x92\x12)4\xa5\x9d\x9a\xbe\xbb\x80q\x04\xbb8r\xfe\xc3\x07\t,\t\x00\x90J\xd9h3\x19\x08\x0e\xe0\x06\x16\x1b\xd9\xb0\xb3\x89\xa2=w!4Jd\x1deZ\x0f\x19\xc69\x923\x13\x14\xd9Y\xfa\xe9\n\xe6Z\xd5w5\xd4\x8c\x8d\xdcx\x0f\x12\x8cr\x84Q\x81\xa1\x9d\xae\xe9o~\x17\xe0\xf3(_n\xe5I\x80/c|\xbe\xdf\xcaW\x01\xbe\x88\xde\xbe\xd8\xca\x17\x01\x9eDy\xe2ypw%(\x94\x19\xf8s\x12Z\xf9\x92\x9a\xa5\xf4\xf9\x8b\x8f\x0f>js\xe5T\xf6~{S\x9f\xda\xf6\x82\x8e\xed\xd9\x1f\x06\xa7N\x18\xf7Q\xdc\xf0\xaf\xcf\x98\xacoPK\x07\x089m\\\xdd\xbe\x00\x00\x00\x98\x02\x00\x00PK\x01\x02\x14\x00\x14\x00\x08\x08\x08\x00\x83\x90\x03I9m\\\xdd\xbe\x00\x00\x00\x98\x02\x00\x00\n\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00input.jsonPK\x05\x06\x00\x00\x00\x00\x01\x00\x01\x008\x00\x00\x00\xf6\x00\x00\x00\x00\x00', u'refresh_files': [], u'user': u'root', u'parcels': {}, u'auto_restart': False, u'run_generation': 2, u'extra_groups': [], u'environment': {}, u'optional_tags': [], u'running': False, u'program': u'mgmt/mgmt.sh', u'required_tags': [], u'arguments': [u'inspector', u'input.json', u'output.json', u'DEFAULT'], u'special_file_info': [], u'group': u'root', u'id': 34, u'resources': [], u'one_off': True}], u'server_manages_parcels': True, u'heartbeat_interval': 15, u'parcels_directory': u'/opt/cloudera/parcels', u'host_id': u'8a18c6eb-7f32-4e90-a6f9-88d1feeacd21', u'eventserver_host': u'clouderamanager.domain.ltd', u'enabled_metric_reporters': [u'ACCUMULO16', u'ACCUMULO16', 
u'KEYTRUSTEE-KMS_KEYTRUSTEE', u'KMS_KEYTRUSTEE', u'SPARK_ON_YARN-SPARK_YARN_HISTORY_SERVER', u'SPARK_YARN_HISTORY_SERVER', u'SOLR-SOLR_SERVER', u'SOLR_SERVER', u'HBASE-HBASERESTSERVER', u'HBASERESTSERVER', u'HOST', u'KEYTRUSTEE_SERVER-KEYTRUSTEE_PASSIVE_SERVER', u'KEYTRUSTEE_PASSIVE_SERVER', u'IMPALA-STATESTORE', u'STATESTORE', u'SPARK', u'SPARK', u'HBASE', u'HBASE', u'ACCUMULO-ACCUMULO_TRACER', u'ACCUMULO_TRACER', u'HDFS-DATANODE', u'DATANODE', u'ACCUMULO-ACCUMULO_MASTER', u'ACCUMULO_MASTER', u'YARN-RESOURCEMANAGER', u'RESOURCEMANAGER', u'HUE-HUE_SERVER', u'HUE_SERVER', u'ACCUMULO-ACCUMULO_MONITOR', u'ACCUMULO_MONITOR', u'MGMT-EVENTSERVER', u'EVENTSERVER', u'MGMT-NAVIGATORMETASERVER', u'NAVIGATORMETASERVER', u'HBASE-MASTER', u'MASTER', u'KAFKA-KAFKA_BROKER', u'KAFKA_BROKER', u'KEYTRUSTEE_SERVER-DB_PASSIVE', u'DB_PASSIVE', u'HBASE-REGIONSERVER', u'REGIONSERVER', u'SPARK_ON_YARN', u'SPARK_ON_YARN', u'MGMT-REPORTSMANAGER', u'REPORTSMANAGER', u'MGMT-SERVICEMONITOR', u'SERVICEMONITOR', u'IMPALA-IMPALAD', u'IMPALAD', u'MGMT-ALERTPUBLISHER', u'ALERTPUBLISHER', u'HIVE-HIVESERVER2', u'HIVESERVER2', u'MGMT-ACTIVITYMONITOR', u'ACTIVITYMONITOR', u'ISILON', u'ISILON', u'YARN-NODEMANAGER', u'NODEMANAGER', u'MAPREDUCE-FAILOVERCONTROLLER', u'FAILOVERCONTROLLER', u'ACCUMULO', u'ACCUMULO', u'MAPREDUCE', u'MAPREDUCE', u'ZOOKEEPER', u'ZOOKEEPER', u'KMS', u'KMS', u'ACCUMULO16-ACCUMULO16_TRACER', u'ACCUMULO16_TRACER', u'ACCUMULO16-ACCUMULO16_MONITOR', u'ACCUMULO16_MONITOR', u'MGMT-HOSTMONITOR', u'HOSTMONITOR', u'YARN-JOBHISTORY', u'JOBHISTORY', u'KEYTRUSTEE', u'KEYTRUSTEE', u'HDFS-JOURNALNODE', u'JOURNALNODE', u'KAFKA', u'KAFKA', u'IMPALA', u'IMPALA', u'SPARK-SPARK_HISTORY_SERVER', u'SPARK_HISTORY_SERVER', u'KEYTRUSTEE_SERVER-KEYTRUSTEE_ACTIVE_SERVER', u'KEYTRUSTEE_ACTIVE_SERVER', u'HDFS-NAMENODE', u'NAMENODE', u'HUE-BEESWAX_SERVER', u'BEESWAX_SERVER', u'SOLR', u'SOLR', u'ACCUMULO16-ACCUMULO16_TSERVER', u'ACCUMULO16_TSERVER', u'MAPREDUCE-TASKTRACKER', u'TASKTRACKER', u'IMPALA-CATALOGSERVER', u'CATALOGSERVER', u'HDFS-DSSDDATANODE', u'DSSDDATANODE', u'SENTRY', u'SENTRY', u'ACCUMULO16-ACCUMULO16_GC', u'ACCUMULO16_GC', u'MGMT-NAVIGATOR', u'NAVIGATOR', u'HIVE', u'HIVE', u'HBASE-HBASETHRIFTSERVER', u'HBASETHRIFTSERVER', u'SQOOP-SQOOP_SERVER', u'SQOOP_SERVER', u'KAFKA-KAFKA_MIRROR_MAKER', u'KAFKA_MIRROR_MAKER', u'FLUME', u'FLUME', u'HUE', u'HUE', u'HDFS-SECONDARYNAMENODE', u'SECONDARYNAMENODE', u'SENTRY-SENTRY_SERVER', u'SENTRY_SERVER', u'ACCUMULO-ACCUMULO_TSERVER', u'ACCUMULO_TSERVER', u'ACCUMULO-ACCUMULO_GC', u'ACCUMULO_GC', u'HIVE-HIVEMETASTORE', u'HIVEMETASTORE', u'IMPALA-LLAMA', u'LLAMA', u'ACCUMULO16-ACCUMULO16_MASTER', u'ACCUMULO16_MASTER', u'SPARK-SPARK_WORKER', u'SPARK_WORKER', u'MGMT', u'MGMT', u'HIVE-WEBHCAT', u'WEBHCAT', u'SQOOP', u'SQOOP', u'HUE-HUE_LOAD_BALANCER', u'HUE_LOAD_BALANCER', u'ACCUMULO-ACCUMULO_LOGGER', u'ACCUMULO_LOGGER', u'HDFS', u'HDFS', u'FLUME-AGENT', u'AGENT', u'OOZIE', u'OOZIE', u'SQOOP_CLIENT', u'SQOOP_CLIENT', u'OOZIE-OOZIE_SERVER', u'OOZIE_SERVER', u'KMS-KMS', u'KMS', u'HDFS-FAILOVERCONTROLLER', u'FAILOVERCONTROLLER', u'KS_INDEXER', u'KS_INDEXER', u'SPARK-SPARK_MASTER', u'SPARK_MASTER', u'YARN', u'YARN', u'ZOOKEEPER-SERVER', u'SERVER', u'HDFS-NFSGATEWAY', u'NFSGATEWAY', u'HDFS-HTTPFS', u'HTTPFS', u'HUE-KT_RENEWER', u'KT_RENEWER', u'KEYTRUSTEE_SERVER', u'KEYTRUSTEE_SERVER', u'KEYTRUSTEE_SERVER-DB_ACTIVE', u'DB_ACTIVE', u'MAPREDUCE-JOBTRACKER', u'JOBTRACKER', u'KS_INDEXER-HBASE_INDEXER', u'HBASE_INDEXER'], u'flood_seed_timeout': 100, u'eventserver_port': 7185}
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.1-py2.7.egg/cmf/agent.py", line 1335, in handle_heartbeat_response
    self._handle_heartbeat_response(response)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.1-py2.7.egg/cmf/agent.py", line 1357, in _handle_heartbeat_response
    response["flood_torrent_port"])
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.1-py2.7.egg/cmf/agent.py", line 1823, in handle_heartbeat_flood
    self.mkabsdir(flood_dir, user=FLOOD_FS_USER, group=FLOOD_FS_GROUP, mode=0755)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.1-py2.7.egg/cmf/agent.py", line 1918, in mkabsdir
    path_grp = grp.getgrgid(stat_info.st_gid)[0]
KeyError: 'getgrgid(): gid not found: 167'
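
For reference, the failing lookup can be reproduced outside the agent with the same stdlib module the traceback points at (a quick diagnostic of mine, not a fix). If the gid is missing, the agent's mkabsdir() will raise exactly the KeyError above:

import grp

# The agent dies in grp.getgrgid() while handling the flood directory;
# this checks whether gid 167 (from the KeyError above) exists at all.
GID = 167

try:
    print("gid %d -> group %r" % (GID, grp.getgrgid(GID).gr_name))
except KeyError:
    print("gid %d has no entry in the group database" % GID)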

Do you have any idea why this error occurs?

 

I hope somebody can help me get to a good configuration so my nodes turn green.

 

Regards,

Fabien


9 REPLIES

Rising Star

A bit of feedback.

 

During the installation, it asked me for the login/password and database name for the "Activity Monitor", but not for the other services.

 

I added this information manually later in Cloudera Manager, in the legacy MySQL database parameters for each service.

 

But if I look at the databases, I see many tables in the Activity Monitor database and zero tables in the others.

 

So it doesn't populate those databases...

 

How can I make it do that?

Maybe my problem comes from here...

 

If somebody has an idea, I'll take it!

Rising Star

I tried removing the Cloudera Management Service.

 

Then I launched a new installation of the Cloudera Management Service with the wizard.

During the installation it again asked me only about the Activity Monitor.

 

It didn't ask about the Service Monitor and Host Monitor, so their databases stay empty...

 

We have a small lab running the 5.5.2 version, and when I installed it, it asked for each service, not only the Activity Monitor...

 

I still don't know why I'm not asked about these databases, and my new nodes are still red with this error message...

 

Nothing interesting shows up in the Cloudera Manager server log.

 

 

Rising Star

More information.

 

From Cloudera Manager, I launched a new host inspection.

 

For my Cloudera Manager host, it's OK.

 

But for my other nodes, I get this error:

Command aborted because of exception: Command timed-out after 150 seconds

But here I can click on "Full log file".

 

And I get a new page with this error:

HTTP ERROR 403

Problem accessing /cmf/process/81/logs. Reason:

    http://mynode.domain.ltd:9000/process/81-host-inspector/files/logs/stderr.log
The server declined access to the page or resource.
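
Out of curiosity, fetching the URL from the 403 page directly (a rough test of mine) should show whether the agent's own web server on port 9000 refuses the request, or whether only the Cloudera Manager proxy does:

# Works on both the agent's Python 2.7 and on Python 3.
try:
    from urllib.request import urlopen   # Python 3
except ImportError:
    from urllib2 import urlopen          # Python 2

URL = ("http://mynode.domain.ltd:9000/"
       "process/81-host-inspector/files/logs/stderr.log")

try:
    print(urlopen(URL, timeout=10).read()[:2000])
except Exception as e:
    print("fetch failed: %s" % e)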

I'm going to check whether I can find something about that, but maybe somebody has encountered this error before...

Rising Star

Problem solved!

 

I remove the "Cloudera Manager Management" cluster, and the "Cluster 1".

I decommission and remove all node and then, add it again with the wizard.

 

I still get this message in the log, but now the wizard continues and I can go to the next step.

 

But now I hit another problem, maybe related to this one...

 

I configured my different services on each node and clicked Next.

 

I'm now on the "Database configuration" page.

 

I have to configure the Hive, Activity Monitor, and Oozie database parameters.

 

So I entered all the information, but for Hive and Oozie it can't connect. It stays on the connection test, and nothing happens.

 

I tried installing the mysql client on the hosts where Hive and Oozie are configured, and the connection works fine.

 

So I put mysql-connector-java.jar in /usr/share/java, then created a symlink in the /var/lib/oozie folder, and on the other host in /var/lib/hive/.

But it's still in the same state...
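
Just to rule out a path problem, a small check like this (my own sketch; the paths are simply the ones I used above) confirms the jar and the symlinks resolve on each host:

import os

# The connector locations described above: the real jar plus the symlinks
# created for Oozie and Hive. Run it on the host that carries each role.
PATHS = [
    "/usr/share/java/mysql-connector-java.jar",
    "/var/lib/oozie/mysql-connector-java.jar",
    "/var/lib/hive/mysql-connector-java.jar",
]

for p in PATHS:
    if os.path.islink(p):
        print("%s -> %s" % (p, os.path.realpath(p)))
    elif os.path.isfile(p):
        print("%s (regular file)" % p)
    else:
        print("%s MISSING" % p)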

 

I ran a tcpdump on my MySQL server, but nothing: no connection is even attempted between my "oozie server" node and my MySQL server...

 

I tried restarting my nodes, but it's still the same.

 

So it works with the mysql client, but not through the mysql-connector-java.jar file.

Am I forgetting something here?

 

FYI, I installed the 5.1.39 version. I first installed it with yum, but saw in the documentation that this wasn't recommended, so I removed it and downloaded the jar file manually.

 

I hope somebody can help me!

 

Rising Star

Not solved for sure...

I'm lost...

 

I don't know why my nodes can't connect to the database.

Only the Cloudera Manager node connects to the database successfully.

 

So I tried removing all the clusters and nodes again, and followed the procedure to clean every node except the Cloudera Manager.

 

Next I select "Host" and "Add new hosts to cluster".

I give all range of my new node, and eveyrthing fine. The wizard, install jdk, cloudera manager agent, and cdh package with success.

 

Now they appear in the hosts list alongside the Cloudera Manager, but I get this:

[screenshot cdh5_01.png] As we can see, all the nodes have a non-detected CDH version.

The third one is the Cloudera Manager.

 

If we check the packages, all hosts have the 5.7.1 version.


Then, when I try to create a new cluster, I see all of my hosts under "Currently Managed Hosts".

I select Cloudera Express, then "Use Packages".

 

And in the next section:

Detecting CDH versions on all hosts
All hosts have the same CDH version.
The following host(s) are running CDH 5.7.1: clouderamanagernode.domain.ltd

So it only sees the Cloudera Manager node and not the others...

 

I searched again, but I can't find out why...

 

The installation of all packages completed successfully on our nodes.

We have the same 5.7.1 version on each one.

 

So why can't it find the CDH version?? Maybe if we figure that out, we can then configure the services successfully.

 

I hope somebody can help.

 

Have a good weekend!

 

Rising Star

Not again...

 

I removed the nodes and added them once more.

Now they appear in the "Hosts" list, and with the CDH 5 version detected.

 

But when I try to create a new cluster, select all the nodes, and assign services to each node, I then have to give the database information for Hive, Oozie, and the Activity Monitor.

 

All the information is correct, but when I press "Test Connection", only the Activity Monitor works...

 

I again downloaded mysql-connector-java.jar manually and put it in /usr/share/java, and created a symlink in the /var/lib/hive/ and /var/lib/oozie folders, but nothing...

 

The connection test for these 2 databases stays stuck on "connecting" and never finishes...

 

Any help with that?

 

As always, if I install mysql-client and try connecting to my databases with the same information, it works fine.

Master Guru

The agents are responsible for the DB test, so I'd check the agent logs on the hosts where Hive and Oozie are configured to run for clues. Cloudera Manager will wait for the agent to tell it when the test is done.
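
For example, something like this (a rough filter; the path is the agent's default log location) will surface the test-db-connection process lines:

# Pull the DB-connection-test lines out of the agent log on the Hive/Oozie host.
LOG = "/var/log/cloudera-scm-agent/cloudera-scm-agent.log"

with open(LOG) as f:
    for line in f:
        if "test-db-connection" in line or " ERROR " in line:
            print(line.rstrip())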

 

-Ben

Rising Star

Hi bgooley,

 

Thanks for your help.

 

OK, so I can only check on the client side then.

 

But only the Cloudera agent log gives me any output.

 

And it looks like this:

 

[08/Aug/2016 13:39:02 +0000] 13735 MainThread agent        INFO     Flood daemon (re)start attempt
[08/Aug/2016 13:39:02 +0000] 13735 MainThread agent        ERROR    Failed to handle Heartbeat Response: {u'firehoses': [], u'rm_enabled': False, u'client_configs': [], u'create_parcel_symlinks': True, u'server_managed_parcels': [], u'extra_configs': None, u'host_collection_config_data': [{u'config_name': u'host_network_interface_collection_filter', u'config_value': u'^lo$'}, {u'config_name': u'host_disk_collection_filter', u'config_value': u'^$'}, {u'config_name': u'host_fs_collection_filter', u'config_value': u'^$'}, {u'config_name': u'host_log_tailing_config', u'config_value': u'{}\n'}, {u'config_name': u'host_dns_resolution_duration_thresholds', u'config_value': u'{"critical":"never","warning":"1000.0"}'}, {u'config_name': u'host_dns_resolution_enabled', u'config_value': u'true'}, {u'config_name': u'host_clock_offset_thresholds', u'config_value': u'{"critical":"10000.0","warning":"3000.0"}'}], u'apply_parcel_users_groups_permissions': True, u'flood_torrent_port': 7191, u'log_tailing_config': u'{}\n', u'active_parcels': {}, u'flood_rack_peers': [u'10.2.0.34:7191', u'10.2.0.31:7191', u'10.2.0.30:7191', u'10.2.0.33:7191', u'10.2.0.29:7191'], u'retain_parcels_in_cache': True, u'processes': [{u'status_links': {}, u'name': u'OOZIE.OOZIE_SERVER-test-db-connection', u'config_generation': 0, u'configuration_data': 'PK\x03\x04\x14\x00\x08\x08\x08\x00\xd7l\x08I\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\r\x00\x00\x00db.propertiesu\xcc1\n\x800\x0c@\xd1\xdd\xbb\x18*\x08\xe2\xd0\x0bx\x01\xe7\xd8\x04,\xb4Mm\xac\xa2\xa7wW\x9c\xdf\xe7;\x89\xe0\x82T\xe2\x82@\x0b$\x8clEn\xcf\x8d{QF\xd5S\n\xd9\xf1\x98]\x97\x8ei\xf0\xbc|\xaa\xfd\xcal\xe3\xa5[\xf8PU.?\xefUt\xb7\x9a|\xbb"\x89\xe4\xd6\x98\x1eB%\x8f\x10\xb9y\x00PK\x07\x08\xf5\xc6w;b\x00\x00\x00\xa4\x00\x00\x00PK\x01\x02\x14\x00\x14\x00\x08\x08\x08\x00\xd7l\x08I\xf5\xc6w;b\x00\x00\x00\xa4\x00\x00\x00\r\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00db.propertiesPK\x05\x06\x00\x00\x00\x00\x01\x00\x01\x00;\x00\x00\x00\x9d\x00\x00\x00\x00\x00', u'refresh_files': [], u'user': u'root', u'parcels': {}, u'auto_restart': False, u'run_generation': 1, u'extra_groups': [], u'environment': {}, u'optional_tags': [], u'running': True, u'program': u'dbconnection/test_db_connection.sh', u'required_tags': [], u'arguments': [u'db.properties'], u'special_file_info': [], u'group': u'root', u'id': 191, u'resources': [], u'one_off': True}], u'server_manages_parcels': True, u'heartbeat_interval': 15, u'parcels_directory': u'/opt/cloudera/parcels', u'host_id': u'549b5395-76f6-43ee-86d2-ef7091a71ce1', u'eventserver_host': None, u'enabled_metric_reporters': [u'ACCUMULO16', u'ACCUMULO16', u'KEYTRUSTEE-KMS_KEYTRUSTEE', u'KMS_KEYTRUSTEE', u'SPARK_ON_YARN-SPARK_YARN_HISTORY_SERVER', u'SPARK_YARN_HISTORY_SERVER', u'SOLR-SOLR_SERVER', u'SOLR_SERVER', u'HBASE-HBASERESTSERVER', u'HBASERESTSERVER', u'HOST', u'KEYTRUSTEE_SERVER-KEYTRUSTEE_PASSIVE_SERVER', u'KEYTRUSTEE_PASSIVE_SERVER', u'IMPALA-STATESTORE', u'STATESTORE', u'SPARK', u'SPARK', u'HBASE', u'HBASE', u'ACCUMULO-ACCUMULO_TRACER', u'ACCUMULO_TRACER', u'HDFS-DATANODE', u'DATANODE', u'ACCUMULO-ACCUMULO_MASTER', u'ACCUMULO_MASTER', u'YARN-RESOURCEMANAGER', u'RESOURCEMANAGER', u'HUE-HUE_SERVER', u'HUE_SERVER', u'ACCUMULO-ACCUMULO_MONITOR', u'ACCUMULO_MONITOR', u'MGMT-EVENTSERVER', u'EVENTSERVER', u'MGMT-NAVIGATORMETASERVER', u'NAVIGATORMETASERVER', u'HBASE-MASTER', u'MASTER', u'KAFKA-KAFKA_BROKER', u'KAFKA_BROKER', u'KEYTRUSTEE_SERVER-DB_PASSIVE', u'DB_PASSIVE', u'HBASE-REGIONSERVER', u'REGIONSERVER', 
u'SPARK_ON_YARN', u'SPARK_ON_YARN', u'MGMT-REPORTSMANAGER', u'REPORTSMANAGER', u'MGMT-SERVICEMONITOR', u'SERVICEMONITOR', u'IMPALA-IMPALAD', u'IMPALAD', u'MGMT-ALERTPUBLISHER', u'ALERTPUBLISHER', u'HIVE-HIVESERVER2', u'HIVESERVER2', u'MGMT-ACTIVITYMONITOR', u'ACTIVITYMONITOR', u'ISILON', u'ISILON', u'YARN-NODEMANAGER', u'NODEMANAGER', u'MAPREDUCE-FAILOVERCONTROLLER', u'FAILOVERCONTROLLER', u'ACCUMULO', u'ACCUMULO', u'MAPREDUCE', u'MAPREDUCE', u'ZOOKEEPER', u'ZOOKEEPER', u'KMS', u'KMS', u'ACCUMULO16-ACCUMULO16_TRACER', u'ACCUMULO16_TRACER', u'ACCUMULO16-ACCUMULO16_MONITOR', u'ACCUMULO16_MONITOR', u'MGMT-HOSTMONITOR', u'HOSTMONITOR', u'YARN-JOBHISTORY', u'JOBHISTORY', u'KEYTRUSTEE', u'KEYTRUSTEE', u'HDFS-JOURNALNODE', u'JOURNALNODE', u'KAFKA', u'KAFKA', u'IMPALA', u'IMPALA', u'SPARK-SPARK_HISTORY_SERVER', u'SPARK_HISTORY_SERVER', u'KEYTRUSTEE_SERVER-KEYTRUSTEE_ACTIVE_SERVER', u'KEYTRUSTEE_ACTIVE_SERVER', u'HDFS-NAMENODE', u'NAMENODE', u'HUE-BEESWAX_SERVER', u'BEESWAX_SERVER', u'SOLR', u'SOLR', u'ACCUMULO16-ACCUMULO16_TSERVER', u'ACCUMULO16_TSERVER', u'MAPREDUCE-TASKTRACKER', u'TASKTRACKER', u'IMPALA-CATALOGSERVER', u'CATALOGSERVER', u'HDFS-DSSDDATANODE', u'DSSDDATANODE', u'SENTRY', u'SENTRY', u'ACCUMULO16-ACCUMULO16_GC', u'ACCUMULO16_GC', u'MGMT-NAVIGATOR', u'NAVIGATOR', u'HIVE', u'HIVE', u'HBASE-HBASETHRIFTSERVER', u'HBASETHRIFTSERVER', u'SQOOP-SQOOP_SERVER', u'SQOOP_SERVER', u'KAFKA-KAFKA_MIRROR_MAKER', u'KAFKA_MIRROR_MAKER', u'FLUME', u'FLUME', u'HUE', u'HUE', u'HDFS-SECONDARYNAMENODE', u'SECONDARYNAMENODE', u'SENTRY-SENTRY_SERVER', u'SENTRY_SERVER', u'ACCUMULO-ACCUMULO_TSERVER', u'ACCUMULO_TSERVER', u'ACCUMULO-ACCUMULO_GC', u'ACCUMULO_GC', u'HIVE-HIVEMETASTORE', u'HIVEMETASTORE', u'IMPALA-LLAMA', u'LLAMA', u'ACCUMULO16-ACCUMULO16_MASTER', u'ACCUMULO16_MASTER', u'SPARK-SPARK_WORKER', u'SPARK_WORKER', u'MGMT', u'MGMT', u'HIVE-WEBHCAT', u'WEBHCAT', u'SQOOP', u'SQOOP', u'HUE-HUE_LOAD_BALANCER', u'HUE_LOAD_BALANCER', u'ACCUMULO-ACCUMULO_LOGGER', u'ACCUMULO_LOGGER', u'HDFS', u'HDFS', u'FLUME-AGENT', u'AGENT', u'OOZIE', u'OOZIE', u'SQOOP_CLIENT', u'SQOOP_CLIENT', u'OOZIE-OOZIE_SERVER', u'OOZIE_SERVER', u'KMS-KMS', u'KMS', u'HDFS-FAILOVERCONTROLLER', u'FAILOVERCONTROLLER', u'KS_INDEXER', u'KS_INDEXER', u'SPARK-SPARK_MASTER', u'SPARK_MASTER', u'YARN', u'YARN', u'ZOOKEEPER-SERVER', u'SERVER', u'HDFS-NFSGATEWAY', u'NFSGATEWAY', u'HDFS-HTTPFS', u'HTTPFS', u'HUE-KT_RENEWER', u'KT_RENEWER', u'KEYTRUSTEE_SERVER', u'KEYTRUSTEE_SERVER', u'KEYTRUSTEE_SERVER-DB_ACTIVE', u'DB_ACTIVE', u'MAPREDUCE-JOBTRACKER', u'JOBTRACKER', u'KS_INDEXER-HBASE_INDEXER', u'HBASE_INDEXER'], u'flood_seed_timeout': 100, u'eventserver_port': None}
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.1-py2.7.egg/cmf/agent.py", line 1335, in handle_heartbeat_response
    self._handle_heartbeat_response(response)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.1-py2.7.egg/cmf/agent.py", line 1357, in _handle_heartbeat_response
    response["flood_torrent_port"])
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.1-py2.7.egg/cmf/agent.py", line 1823, in handle_heartbeat_flood
    self.mkabsdir(flood_dir, user=FLOOD_FS_USER, group=FLOOD_FS_GROUP, mode=0755)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.7.1-py2.7.egg/cmf/agent.py", line 1918, in mkabsdir
    path_grp = grp.getgrgid(stat_info.st_gid)[0]
KeyError: 'getgrgid(): gid not found: 167'

I can find nothing that helps me with this error.

 

 

In the log details, we can see this line:

 

host_network_interface_collection_filter', u'config_value': u'^lo$'

But I'm not sure that's the problem.

 

 

In the Cloudera wizard, I gave the correct remote hostname and MySQL information for each database, so I don't know whether the ^lo$ pattern is the problem.
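
For what it's worth, ^lo$ is just a regular expression that matches the literal interface name "lo" (the loopback) and nothing else, so it looks like a host metric collection filter rather than anything related to database connections. A tiny illustration of mine:

import re

# ^lo$ anchors both ends, so only the exact name "lo" matches.
pattern = re.compile(r"^lo$")
for name in ["lo", "eth0", "lo0", "bond0"]:
    print("%-5s -> %s" % (name, bool(pattern.match(name))))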

 

I checked the /var/log/{oozie,hive} log directories, but nothing. No logs. And I'm on the machine where I'm trying to install the Hive and Oozie services.

 

Maybe this gives you an idea of the origin of the error!

 

Thanks again, bgooley!

 

Rising Star (Accepted Solution)

It works!!

 

I found the workaround. After reading the log again, I found the Python error suspicious, so I decided to check that error again.

 

The zookeeper user had uid 167 but gid 153.

I changed the gid to 167 in /etc/passwd and /etc/group, and the DB test works now!!

I can now continue the setup.
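
For anyone hitting the same KeyError, this quick scan (stdlib only, my own script) flags any account whose primary gid has no matching group entry, which is exactly the inconsistency I had:

import pwd
import grp

# A user whose primary gid is missing from the group database is what made
# the agent's grp.getgrgid(167) raise the KeyError.
for user in pwd.getpwall():
    try:
        grp.getgrgid(user.pw_gid)
    except KeyError:
        print("user %s: uid=%d, primary gid=%d has no group entry" %
              (user.pw_name, user.pw_uid, user.pw_gid))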

 

But I don't know why the setup went wrong in the first place... Maybe a bug in the Red Hat package for the 5.7.1 version. I don't know.

 

But finally it works: I finished my setup and my cluster is now live!

 

Now I'm going to tune and optimize the configuration!

Maybe this post can help somebody.

 

Regards,