New Contributor
Posts: 2
Registered: ‎10-21-2018
Accepted Solution

CM machine crashes, re-install CM on new machine and add existing Hosts(datanodes)


 

Our whole cluster was installed and is managed by CM6/CDH6: one machine for CM and four other machines for CDH. The embedded DB is not used; MySQL is deployed as an external DB. It ran well, but then the CM machine crashed due to a hardware failure. Is there a way to replace the hardware, reinstall the same version of CM, and add the existing hosts (datanodes) to the same cluster again?

 

If there is a way to reinstall CM on the replacement machine after a crash, and then add host machines back to an existing cluster that was previously installed and managed by the same version of CM, that will be sufficient for us.

 

I tried to add the existing hosts (datanodes), but the installation stopped with the message below at Cluster Installation -> Install Parcels:
Src file /opt/cloudera/parcels/.flood/CDH-5.15.1-1.cdh5.15.1.p0.4-el6.parcel/CDH-5.15.1-1.cdh5.15.1.p0.4-el6.parcel does not exist

ParcelError.png

 

Any suggestions? Am I doing this the right way, or is there another, correct way to achieve this?

Posts: 945
Topics: 1
Kudos: 222
Solutions: 119
Registered: ‎04-22-2014

Re: CM machine crashes, re-install CM on new machine and add existing Hosts(datanodes)

@manjj,

 

If you lost your database and then reinstalled CM, the agents will not complete the heartbeat to the new CM since the cm_guid does not match the value in CM.

 

To correct this, on all hosts with agents running:

 

- # rm /var/lib/cloudera-scm-agent/cm_guid

- # service cloudera-scm-agent restart
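
If you have several hosts, the two commands above can be wrapped in a small loop. This is only a rough sketch: the host names are placeholders, and it assumes you can ssh to each host as root. The cm_guid path and the service name are the ones from the commands above. It defaults to a dry run that only prints what it would do.

```shell
#!/bin/sh
# Hypothetical host list (placeholder names); replace with your own agent hosts.
HOSTS="${HOSTS:-node1.example.com node2.example.com}"
CM_GUID_FILE=/var/lib/cloudera-scm-agent/cm_guid

# Build the command to run on each host: remove the stale cm_guid,
# then restart the agent so it can heartbeat to the new CM.
reset_agent_cmd() {
    printf 'rm -f %s && service cloudera-scm-agent restart' "$CM_GUID_FILE"
}

# With DRY_RUN=1 (the default) only print the ssh commands;
# with DRY_RUN=0 actually run them. Root ssh access is an assumption.
reset_agents() {
    for host in $HOSTS; do
        if [ "${DRY_RUN:-1}" = "1" ]; then
            echo "ssh root@$host '$(reset_agent_cmd)'"
        else
            ssh "root@$host" "$(reset_agent_cmd)"
        fi
    done
}

reset_agents
```

Check the printed commands first, then run it again with DRY_RUN=0 once the host list looks right.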

 

I think the reason you are seeing those errors in the parcels page is because the agents are in bad health... due to the cm_guid.

 

The cm_guid is generated by CM, and the agent stores it to make sure the agent does not communicate with an unexpected CM / database. Removing it allows the agent to accept communication with the new CM server/db that you have.
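
After the restart, one quick sanity check is to see whether the agent has stored a cm_guid again. This is a minimal sketch and rests on an assumption: that the agent writes a fresh cm_guid file once it has heartbeated successfully to the new CM, which is how I read the behaviour described above. `guid_status` is just a helper name made up for this example.

```shell
#!/bin/sh
# Path from the commands above; can be overridden for testing.
CM_GUID_FILE="${CM_GUID_FILE:-/var/lib/cloudera-scm-agent/cm_guid}"

# Print "present" if the given file exists and is non-empty, else "missing".
guid_status() {
    if [ -s "$1" ]; then
        echo present
    else
        echo missing
    fi
}

guid_status "$CM_GUID_FILE"
```

If it still reports missing a few minutes after the restart, the agent log on that host is the next place to look.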

 

 

New Contributor
Posts: 2
Registered: ‎10-21-2018

Re: CM machine crashes, re-install CM on new machine and add existing Hosts(datanodes)

Thanks a lot, this works for me.