Cloudera Enterprise Trial 6.1.0 - Add Host Error

Explorer

I am getting the following error when trying to add a host to Cloudera:

 

[25/Mar/2020 04:56:45 +0000] 1611 MainThread agent ERROR Failed to handle Heartbeat Response:{....} (A big response) 

Traceback (most recent call last):
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1526, in handle_heartbeat_response
self._handle_heartbeat_response(response)
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1661, in _handle_heartbeat_response
self._update_parcel_activation_state(response)
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1572, in _update_parcel_activation_state
manage_old_parcels = old_response.get("create_parcel_symlinks")
AttributeError: 'NoneType' object has no attribute 'get'

 

This causes the following error:

 

Traceback (most recent call last):
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/parcel.py", line 125, in refresh
pid = ParcelId.dir(child)
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/parcel_id.py", line 75, in dir
raise Exception("Invalid parcel directory: %s" % (dir))
Exception: Invalid parcel directory: CDH

 

I have checked the /etc/hosts file, which seems fine and consistent. Is there any other way to debug this?

Thanks,

12 REPLIES

@ankesh_clo Can you check the /var/lib/cloudera-scm-agent/ directory on the new host and delete the file response.avro if it exists?

After that, go to CM > Hosts > All hosts, click on the newly added host, and compare its HOST ID with the contents of /var/lib/cloudera-scm-agent/uuid. If they are not in sync, modify the uuid file to match the HOST ID and restart the agent.
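
The comparison described above can be scripted. This is only a minimal sketch in Python 3: the uuid path comes from the reply, the Host ID has to be copied by hand from the CM host page, and the function names read_agent_uuid and uuid_matches are made up for illustration.

```python
from pathlib import Path

# Path taken from the reply above; the agent keeps its host UUID in this file.
UUID_FILE = Path("/var/lib/cloudera-scm-agent/uuid")


def read_agent_uuid(uuid_file=UUID_FILE):
    """Return the UUID stored in the agent's uuid file, or None if the file is missing."""
    try:
        return Path(uuid_file).read_text().strip()
    except FileNotFoundError:
        return None


def uuid_matches(host_id, uuid_file=UUID_FILE):
    """Return True if the Host ID copied from CM equals the agent's stored UUID."""
    agent_uuid = read_agent_uuid(uuid_file)
    return agent_uuid is not None and agent_uuid == host_id.strip()
```

Calling uuid_matches("<Host ID copied from CM>") on the new host returns False when the file is missing or the two values differ, which is the case where the reply suggests editing the uuid file and restarting the agent.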


Cheers!
Was your question answered? Make sure to mark the answer as the accepted solution.
If you find a reply useful, say thanks by clicking on the thumbs up button.

Explorer

@GangWar  Thanks for the reply.

But this didn't solve the issue. There is no parcel downloaded and no progress.  The same error appears on the agent as posted above.

 

For restarting the agent, I use sudo service cloudera-scm-agent restart.

 

Please let me know where I am going wrong.

Thanks!

@ankesh_clo Looking at the stack trace, this also seems to be an issue with symlinks. Try the method below once.

1. Stop the agent.
2. Remove the broken symlinks from /etc/alternatives and the corresponding conf files from /var/lib/alternatives.
3. Remove the parcels/files from:
/opt/cloudera/parcels
/opt/cloudera/parcels/.flood
/opt/cloudera/parcel-cache
4. Start the agent. The agent will redistribute the parcel and fix the alternatives.
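
For step 2, one way to see which links are actually broken before removing anything is a small read-only check. The sketch below is Python 3 and the helper name find_broken_symlinks is hypothetical; it only lists dangling links and deletes nothing.

```python
import os


def find_broken_symlinks(directory):
    """Return the sorted paths of symlinks in `directory` whose targets do not exist."""
    broken = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        # os.path.exists follows symlinks, so it is False for a dangling link.
        if os.path.islink(path) and not os.path.exists(path):
            broken.append(path)
    return sorted(broken)
```

Running find_broken_symlinks("/etc/alternatives") on the new host returns an empty list if there is nothing to clean up in that directory.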

 


Cheers!

Explorer

@GangWar I do not see broken links in /etc/alternatives and there is no folder named /var/lib/alternatives.

 

Also, there are no parcel files downloaded on the new host. The parcel and parcel-cache folders are empty.

 

Should I add a more detailed log? 

Thanks,

Yes, can you upload the CM server log file and the agent log file (new host)? Let's see if we can find something.

Cheers!

Explorer

@GangWar Thanks for the help. I am attaching links to the log files from the server and the agent host.

 

Agent: https://ideone.com/TNKtRZ

Server: https://ideone.com/RHt1Yl

 

PS: Sorry, there was an issue uploading the log file, so I have uploaded the relevant portion.

Please visit the links and download the log files (the download option is at the top of the editor, below the URL box).

@ankesh_clo Looking at the logs, the issue also seems to be caused by a wrong parcel URL.

2020-04-29 01:28:33,070 ERROR ParcelUpdateService:com.cloudera.parcel.components.ParcelDownloaderImpl: (1 skipped) Failed to download manifest. Status code: 404 URI: https://archive.cloudera.com/accumulo-c6/parcels/latest/manifest.json/

The correct link is https://archive.cloudera.com/accumulo/parcels/latest/

Either correct this link in the Parcel configuration or just remove this Accumulo parcel link and try again.
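
To check a configured parcel repository URL from the command line before changing the configuration, something like the following Python 3 sketch could be used. It assumes the repository serves a manifest.json directly under the repository URL, as the URI in the log line above suggests; the function names manifest_url and manifest_status are illustrative only.

```python
import urllib.error
import urllib.request


def manifest_url(repo_url):
    """Build the manifest.json URL for a parcel repository URL."""
    return repo_url.rstrip("/") + "/manifest.json"


def manifest_status(repo_url, timeout=10):
    """Return the HTTP status code for the repository's manifest, or None if unreachable."""
    request = urllib.request.Request(manifest_url(repo_url), method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered with an error status, e.g. 404 for a wrong path.
        return err.code
    except OSError:
        # DNS failure, refused connection, timeout and similar network errors.
        return None
```

A repository entry for which manifest_status(...) returns 404 would be a candidate for correcting or removing, as with the accumulo-c6 link above.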


Cheers!

Explorer

@GangWar I removed the link from the configuration as I am not using Accumulo. The download error is gone, but the errors still persist.

Super Collaborator

Hi @ankesh_clo,

 

Could you please share the output of the three commands below from the host? This will help us find out whether there are any permission issues, etc.

ls -altr /var/lib/cloudera-scm-agent

and

ls -altr /opt/cloudera

and

ls -altr /opt/cloudera/parcels
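
If it is easier to collect this from a script, the same ownership and mode information can be gathered with a rough Python 3 equivalent. This is only a sketch, not a replacement for the ls output requested above; describe_entries is a made-up helper name, and unlike ls -a it does not include the . and .. entries.

```python
import grp
import os
import pwd
import stat


def _name_or_id(lookup, numeric_id):
    """Resolve a uid/gid to a name, falling back to the number if it is unknown."""
    try:
        return lookup(numeric_id)[0]
    except KeyError:
        return str(numeric_id)


def describe_entries(directory):
    """Return (mode, owner, group, name) tuples for `directory`, oldest entry first."""
    rows = []
    for entry in os.scandir(directory):
        info = entry.stat(follow_symlinks=False)
        rows.append((
            info.st_mtime,
            stat.filemode(info.st_mode),          # e.g. 'drwxr-xr-x'
            _name_or_id(pwd.getpwuid, info.st_uid),
            _name_or_id(grp.getgrgid, info.st_gid),
            entry.name,
        ))
    rows.sort()  # sort by modification time, like ls -tr
    return [row[1:] for row in rows]
```

For example, describe_entries("/opt/cloudera/parcels") lists each entry with its mode, owner and group, which is the part of the ls output that matters for a permissions check.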

 

Thanks,

Li

Li Wang, Technical Solution Manager



Explorer

Hi @lwang 

Thanks for responding.

 

I ran these commands on the new host I am trying to add. Here are the outputs in order:

 

ls -altr /var/lib/cloudera-scm-agent

total 36
drwxr-xr-x 46 root root 4096 Apr 21 22:41 ..
-rw-r--r-- 1 root root 36 Apr 21 22:41 uuid
-rw-r--r-- 1 root root 36 Apr 21 22:41 cm_guid
-rw------- 1 root root 14575 May 5 02:19 response.avro
drwxr-xr-x 2 cloudera-scm cloudera-scm 4096 May 5 02:19 .
-rw------- 1 root root 2 May 6 03:09 active_parcels.json

 

ls -altr /opt/cloudera

total 40
drwxr-xr-x 3 root root 4096 Apr 21 22:40 ..
drwxr-xr-x 2 root root 4096 Apr 21 22:41 parcel-cache
drwxr-xr-x 6 cloudera-scm cloudera-scm 4096 Apr 21 22:41 .
drwxr-xr-x 10 root root 4096 Apr 27 02:59 cm-agent
drwxr-xr-x 27 root root 20480 Apr 27 02:59 cm
drwxr-xr-x 3 root root 4096 May 4 02:03 parcels

 

ls -altr /opt/cloudera/parcels

drwxr-xr-x 6 cloudera-scm cloudera-scm 4096 Apr 21 22:41 ..
drwxr-xr-x 2 cloudera-scm cloudera-scm 4096 May 4 02:03 .flood
drwxr-xr-x 3 root root 4096 May 4 02:03 .

 

Please have a look. Thanks!

@ankesh_clo Can you delete the files below and then try restarting the agent?

-rw------- 1 root root 14575 May 5 02:19 response.avro
-rw------- 1 root root 2 May 6 03:09 active_parcels.json
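
A more cautious variant of the deletion is to move the two files out of the agent directory so they can be restored later. The Python 3 sketch below assumes it is run as root (both files are owned by root with mode 0600 in the listing above); the backup directory and the function name set_aside_agent_state are hypothetical choices, not something the agent provides.

```python
import os
import shutil

AGENT_STATE_DIR = "/var/lib/cloudera-scm-agent"
# The two cached state files named in the reply above.
STALE_FILES = ("response.avro", "active_parcels.json")


def set_aside_agent_state(backup_dir, state_dir=AGENT_STATE_DIR, names=STALE_FILES):
    """Move each existing state file into backup_dir and return the new paths."""
    os.makedirs(backup_dir, exist_ok=True)
    moved = []
    for name in names:
        source = os.path.join(state_dir, name)
        if os.path.isfile(source):
            destination = os.path.join(backup_dir, name)
            shutil.move(source, destination)
            moved.append(destination)
    return moved
```

For example, set_aside_agent_state("/root/cm-agent-state-backup") followed by an agent restart has the same effect on the agent directory as deleting the files, with the originals kept in the backup directory.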

 


Cheers!

Explorer

@GangWar I did what you asked, and I get this on the agent:

 

[11/May/2020 05:22:34 +0000] 6495 MainThread agent INFO To override these variables, use /etc/cloudera-scm-agent/config.ini. Environment variables for CDH locations are not used when CDH is installed from parcels.
[11/May/2020 05:22:36 +0000] 6495 MainThread supervisor INFO Trying to connect to supervisor (Attempt 1)
[11/May/2020 05:22:36 +0000] 6495 MainThread supervisor INFO Supervisor version: 3.0, pid: 1614
[11/May/2020 05:22:36 +0000] 6495 MainThread supervisor INFO Successfully connected to supervisor
[11/May/2020 05:22:36 +0000] 6495 MainThread agent INFO Supervisor version: 3.0, pid: 1614
[11/May/2020 05:22:36 +0000] 6495 MainThread agent INFO Connecting to previous supervisor: agent-1614-1589173066.
[11/May/2020 05:22:38 +0000] 6495 MainThread supervisor INFO Triggering supervisord update.
[11/May/2020 05:22:38 +0000] 6495 MainThread _cplogging INFO [11/May/2020:05:22:38] ENGINE Bus STARTING
[11/May/2020 05:22:38 +0000] 6495 MainThread _cplogging INFO [11/May/2020:05:22:38] ENGINE Started monitor thread '_TimeoutMonitor'.
[11/May/2020 05:22:38 +0000] 6495 MainThread _cplogging INFO [11/May/2020:05:22:38] ENGINE Serving on http://127.0.0.1:9001
[11/May/2020 05:22:38 +0000] 6495 MainThread _cplogging INFO [11/May/2020:05:22:38] ENGINE Bus STARTED
[11/May/2020 05:22:40 +0000] 6495 MainThread daemon INFO New monitor: (<cmf.monitor.host.HostMonitor object at 0x7f46e2d12ed0>,)
[11/May/2020 05:22:40 +0000] 6495 MonitorDaemon-Scheduler daemon INFO Monitor ready to report: ('HostMonitor',)
[11/May/2020 05:22:40 +0000] 6495 MainThread agent INFO Setting default socket timeout to 45
[11/May/2020 05:22:40 +0000] 6495 MainThread agent INFO Failed to read available parcel file: [Errno 2] No such file or directory: '/var/lib/cloudera-scm-agent/active_parcels.json'
[11/May/2020 05:22:40 +0000] 6495 MainThread agent INFO Loading last saved hb response to complete initialization: /var/lib/cloudera-scm-agent/response.avro
[11/May/2020 05:22:40 +0000] 6495 Monitor-HostMonitor network_interfaces INFO NIC iface ens5 doesn't support ETHTOOL (95)
[11/May/2020 05:22:40 +0000] 6495 MainThread heartbeat_tracker INFO HB stats (seconds): num:1 LIFE_MIN:0.02 min:0.02 mean:0.02 max:0.02 LIFE_MAX:0.02
[11/May/2020 05:22:40 +0000] 6495 MainThread agent INFO CM server guid: 513d3669-b5a8-49c0-863a-c0396dff5c7b
[11/May/2020 05:22:40 +0000] 6495 MainThread agent INFO Using parcels directory from server provided value: /opt/cloudera/parcels
[11/May/2020 05:22:40 +0000] 6495 MainThread parcel INFO Agent does create users/groups and apply file permissions
[11/May/2020 05:22:40 +0000] 6495 MainThread downloader INFO Downloader path: /opt/cloudera/parcel-cache
[11/May/2020 05:22:40 +0000] 6495 MainThread parcel_cache INFO Using /opt/cloudera/parcel-cache for parcel cache
[11/May/2020 05:22:40 +0000] 6495 MainThread throttling_logger WARNING Failed parsing alternatives line: rename string index out of range link best version is /usr/bin/file-rename
[11/May/2020 05:22:40 +0000] 6495 MainThread agent INFO Flood daemon (re)start attempt
[11/May/2020 05:22:42 +0000] 6495 MainThread firehoses INFO Reporting interval updated: 5.0 -> 60
[11/May/2020 05:22:42 +0000] 6495 MainThread agent ERROR Failed to handle Heartbeat Response: {u'firehoses': [{u'rol [big response....]

-----------------------

Traceback (most recent call last):
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1528, in handle_heartbeat_response
self._handle_heartbeat_response(response)
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1661, in _handle_heartbeat_response
self._update_parcel_activation_state(response)
File "/opt/cloudera/cm-agent/lib/python2.7/site-packages/cmf/agent.py", line 1572, in _update_parcel_activation_state
manage_old_parcels = old_response.get("create_parcel_symlinks")
AttributeError: 'NoneType' object has no attribute 'get'
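
For readers unfamiliar with this kind of failure, the last line of the traceback is what Python raises whenever .get is called on None instead of on a dict. The snippet below is not the agent's code; it is a minimal Python 3 illustration that reuses the variable name and key from the traceback, and read_flag is a hypothetical helper showing the usual defensive pattern. Why old_response is None on this host is not established by the thread.

```python
# Reproduce the error from the traceback: the variable holds None, not a dict.
old_response = None
try:
    old_response.get("create_parcel_symlinks")
except AttributeError as err:
    message = str(err)  # "'NoneType' object has no attribute 'get'"


def read_flag(saved_response, key="create_parcel_symlinks"):
    """Return saved_response[key], or None when there is no saved response at all."""
    if saved_response is None:
        return None
    return saved_response.get(key)
```

With the guard in place, read_flag(None) returns None instead of raising, while read_flag({"create_parcel_symlinks": True}) returns True.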

 
