Cloudera agent has issues after OS upgrade from RHEL 7.5 to 7.9

Contributor

Hi,

After we upgraded the OS on one server in the cluster from RHEL 7.5 to 7.9, we can no longer start Cloudera services on that host from the web UI. The agent log file shows the errors below. Can anyone help me resolve this issue?

 

AttributeError: 'NoneType' object has no attribute 'get'
[30/Sep/2022 19:37:38 +0000] 1048 MainThread agent WARNING Long HB processing time: 7.00362992287
[30/Sep/2022 19:38:01 +0000] 1048 DnsResolutionMonitor throttling_logger INFO DnsTest not running. Java not located.
[30/Sep/2022 19:38:37 +0000] 1048 MonitorDaemon-Reporter firehoses INFO Creating a connection to the ACTIVITYMONITOR.
[30/Sep/2022 19:38:37 +0000] 1048 MonitorDaemon-Reporter firehoses INFO Creating a connection to the SERVICEMONITOR.
[30/Sep/2022 19:38:37 +0000] 1048 MonitorDaemon-Reporter firehoses INFO Creating a connection to the HOSTMONITOR.
[30/Sep/2022 19:38:37 +0000] 1048 MonitorDaemon-Reporter throttling_logger ERROR Error sending messages to firehose: mgmt-HOSTMONITOR-d85e01b86fca15e35281bae2797b3c77
Traceback (most recent call last):
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/monitor/firehose.py", line 121, in _send
self._port)
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/avro/ipc.py", line 469, in __init__
self.conn.connect()
File "/usr/lib64/python2.7/httplib.py", line 837, in connect
self.timeout, self.source_address)
File "/usr/lib64/python2.7/socket.py", line 571, in create_connection
raise err
error: [Errno 111] Connection refused
[30/Sep/2022 19:47:09 +0000] 1048 MainThread heartbeat_tracker INFO HB stats (seconds): num:46 LIFE_MIN:0.00 min:0.00 mean:0.02 max:0.26 LIFE_MAX:0.01

1 ACCEPTED SOLUTION

Master Collaborator

[30/Sep/2022 19:37:38 +0000] 1048 MainThread agent WARNING Long HB processing time: 7.00362992287
[30/Sep/2022 19:38:01 +0000] 1048 DnsResolutionMonitor throttling_logger INFO DnsTest not running. Java not located.

Hey @hanu, thank you for posting this in our Community.

SANITY CHECKS:

The log lines above suggest that JAVA_HOME may not be set.

Q. Can you check whether this host has the right JAVA_HOME set, or whether Java is configured in a custom location?

0a. Navigate to CM > Hosts > All Hosts > Affected Host > Quick Link > Host agent and check the value of JAVA_HOME in the "Component Info" section.
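
To complement the CM UI check, here is a minimal shell sketch for checking on the host itself whether a given JAVA_HOME points at a usable java binary. The function name check_java_home is hypothetical and not part of any Cloudera tooling, and it only looks at the value you pass in, which may differ from what the agent process actually sees.

```shell
# Hypothetical helper: report whether the given JAVA_HOME contains an
# executable bin/java. It does not inspect the agent's own environment.
check_java_home() {
    java_home="$1"
    if [ -n "$java_home" ] && [ -x "$java_home/bin/java" ]; then
        echo "ok: $java_home/bin/java"
    elif [ -n "$java_home" ]; then
        echo "missing: $java_home/bin/java is not executable"
    else
        echo "unset: JAVA_HOME is empty"
    fi
}

# Check the JAVA_HOME of the current shell (empty string if it is not set).
check_java_home "${JAVA_HOME:-}"
```

If this prints "unset" or "missing" after the OS upgrade, that would be consistent with the "Java not located" message in the agent log.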

 

1. Make sure the CM agent has the proper values in /etc/cloudera-scm-agent/config.ini.

1a. Compare the UUID of the CM agent on the affected node and fix it if need be.

Compare the UUID shown under CM > Hosts > All Hosts > Affected Host with the contents of /var/lib/cloudera-scm-agent/uuid.
If they do not match, carry on with the following steps:

Navigate to CM > Hosts > All Hosts

1. Identify the UUID
Click on the affected host and, under “Details”, copy the “Host ID”. This is the affected CM agent's UUID.

2. SSH to the affected host and stop the CM agent:
# systemctl stop cloudera-scm-agent

3. Navigate to /var/lib/cloudera-scm-agent/

4. Make a backup of the existing uuid file:
# cp /var/lib/cloudera-scm-agent/uuid /tmp/agent-uuid.bk

5. Replace the contents of the uuid file with the UUID from the CM UI (i.e. from step 1):
# echo -n "<UUID_from_CM-UI>" > /var/lib/cloudera-scm-agent/uuid

6. Start the agent:
# systemctl start cloudera-scm-agent
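
Before overwriting anything, the comparison itself can be sketched in shell. This is only an illustration: uuid_file_matches is a hypothetical helper name, and the expected value has to be copied by hand from the “Host ID” field in the CM UI.

```shell
# Hypothetical helper: compare the UUID stored in a file with an expected value.
# Prints "match" and returns 0, prints a mismatch message and returns 1,
# or returns 2 if the file cannot be read.
uuid_file_matches() {
    uuid_file="$1"
    expected="$2"
    [ -r "$uuid_file" ] || { echo "unreadable: $uuid_file"; return 2; }
    # Strip whitespace so a trailing newline does not cause a false mismatch.
    actual=$(tr -d '[:space:]' < "$uuid_file")
    if [ "$actual" = "$expected" ]; then
        echo "match"
    else
        echo "mismatch: file has '$actual'"
        return 1
    fi
}

# Example call on the affected host (replace the placeholder with the Host ID):
# uuid_file_matches /var/lib/cloudera-scm-agent/uuid "<UUID_from_CM-UI>"
```

If it reports a mismatch, the stop, backup, overwrite and start steps above apply.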

2. Check the network configuration if there is no heartbeat to the CM Server.

Compare the following files with those on a node that has no issues, and update them as needed to match:

/etc/cloudera-scm-agent/config.ini
/etc/resolv.conf
/etc/hosts
/etc/nsswitch.conf

Once the above is updated or fixed, restart the CM agent:

# systemctl restart cloudera-scm-agent
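
As a rough aid for the config.ini comparison, the sketch below reads the server_host and server_port keys from the agent's config file and checks whether the host name resolves. The helper name ini_value is hypothetical, and the parsing assumes plain key=value lines, so treat it as a starting point rather than a complete check.

```shell
# Hypothetical helper: print the value of KEY from a simple key=value file.
# Lines starting with # or ; are ignored and only the first match is printed.
ini_value() {
    awk -F= -v key="$2" '
        /^[ \t]*[#;]/ { next }
        { k = $1; gsub(/[ \t]/, "", k) }
        k == key {
            v = substr($0, index($0, "=") + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", v)
            print v
            exit
        }
    ' "$1"
}

# Only runs on a host where the agent config exists and is readable.
conf=/etc/cloudera-scm-agent/config.ini
if [ -r "$conf" ]; then
    host=$(ini_value "$conf" server_host)
    port=$(ini_value "$conf" server_port)
    echo "agent reports to: $host:$port"
    # getent follows /etc/nsswitch.conf, so it covers /etc/hosts and DNS.
    getent hosts "$host" || echo "cannot resolve $host"
fi
```

Running the same snippet on a healthy node and comparing the output is a quick way to spot a differing server host or a name that no longer resolves.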

Let us know if this helps; otherwise, share your observations.

Thanks!


6 REPLIES

Community Manager

@hanumanth, has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future. Thanks.


Regards,

Diana Torres,
Community Moderator


Master Collaborator

Hey @hanumanth,

Circling back to check whether the posted checks helped you with this task.

Additionally, please share what was done to resolve the situation.

HTH!

Contributor

Hey Team,

 

As suggested by @vaishaakb, I checked the UUID and set JAVA_HOME properly, and the Cloudera agent is now able to communicate with the server. Thanks for your help.

Contributor

Thank you to everyone who supported or helped with this issue.

Master Collaborator

Hey @hanumanth, thank you for marking this post as the Accepted Solution.

If you found my sanity checks helpful, you can also thank me by clicking on the thumbs up button :)

 

V