CM machine crashes, re-install and re-wire to the same cluster

Hi there,


We have a cluster that was installed and is managed by CM6/CDH6: 1 machine for CM and 4 other machines for CDH.  It ran well, but then the CM machine crashed due to a hardware failure.  Is there a way to replace the hardware, reinstall the same version of CM, and re-wire it to the same cluster?


I understand that to prevent this from happening again, it is better to configure CM for HA.  But from reading the documentation, it seems that to make this one CM machine HA, we have to set up an HA database, an HA NFS mount, and an HA proxy server, in addition to splitting that one machine into 4 machines.  It is actually much more involved than making the CDH cluster (master and datanodes) HA.  Plus, we don't have the resources to set up HA NFS and an HA database...


If only there were a way to re-install the CM machine after it crashes and re-wire the machine to an existing cluster that was previously installed/managed by the same version of CM, that would be sufficient for us.


Any suggestions?

Re: CM machine crashes, re-install and re-wire to the same cluster

It depends on where your metadata was stored. There is no magic between CM and the rest of the cluster, so "re-wire" just means pointing your Cloudera agents at the new CM host. The most important thing is to restore your SCM database, since that is where CM keeps the cluster's configuration.
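
To make the "point your agents to the new CM host" step concrete, here is a minimal sketch in Python. It assumes the agent configuration lives in /etc/cloudera-scm-agent/config.ini and contains a server_host entry, which is the usual layout for a packaged install; the helper name set_server_host and the example host names are made up for illustration, so adjust them to your environment.

```python
import re

def set_server_host(config_text: str, new_host: str) -> str:
    """Return the agent config text with server_host pointing at new_host.

    Hypothetical helper: it only rewrites the text, it does not touch any file.
    """
    # Match an uncommented "server_host = value" line, keeping the key part as-is.
    pattern = re.compile(r"^([ \t]*server_host[ \t]*=[ \t]*).*$", flags=re.MULTILINE)
    # Use a function as the replacement so new_host is inserted literally.
    updated, count = pattern.subn(lambda m: m.group(1) + new_host, config_text)
    if count == 0:
        raise ValueError("no server_host entry found in the agent config")
    return updated

if __name__ == "__main__":
    # Example input standing in for the contents of config.ini (assumed values).
    sample = "[General]\nserver_host=old-cm.example.com\nserver_port=7182\n"
    print(set_server_host(sample, "new-cm.example.com"))
```

On each of the 4 CDH hosts you would apply the same change to the real config.ini (by hand is fine for a cluster this small) and then restart the cloudera-scm-agent service so it heartbeats to the new CM host. If the replacement machine gets the same hostname as the old one, the agents may not need any change at all.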