Support Questions


cloudera manager database lost

Explorer

Hi,

Yesterday I found that my Cloudera Manager machine had been formatted by a colleague. I use Cloudera Manager to manage two production clusters, and I had not backed up the SCM database beforehand. I am trying to recover the SCM database: I think I can restore the hosts, roles, and roles_config_groups tables, but I found the configs table very hard to restore.

So I have two questions:

1. Is there any way to recover the SCM database, especially the configs and client_configs tables?
2. If I cannot recover Cloudera Manager, how can I manage my clusters manually?

Any documentation would be helpful. Please help, thanks.
I use Cloudera Manager 5.3.1 and CDH 2.5.0.

1 ACCEPTED SOLUTION

Explorer

 

Finally, I got it done, so I am posting my steps; maybe they will help someone who runs into the same problem.

I use CDH 5.3.1. The main steps are: first, set up a new Cloudera Manager and reconfigure all parameters and roles in it; then add all the process_ids to the processes table in the SCM database; finally, change server_host in /etc/cloudera-scm-agent/config.ini on every host to point at the new manager and restart all the agents.

First of all, you should back up and prepare for the worst.

1 Cloudera Manager provides two ways to back up: back up the database, or export the configuration to a JSON file.

 1.1 Back up the database:

    http://www.cloudera.com/content/www/en-us/documentation/enterprise/5-2-x/topics/cm_ag_backup_dbs.htm...

   For example:

       backup:
   pg_dump -h localhost -p 7432 -U scm -W -F c -b -v -f "scm_db.db" scm
       restore:
   pg_restore -p 7432 -U scm -W -d scm -v scm_db.db
 1.2 Export the configuration to a JSON file:
  http://www.cloudera.com/content/www/en-us/documentation/enterprise/5-3-x/topics/cm_intro_api.html#xd...
  For example:
       export:
   curl -u admin:admin "http://localhost:8888/api/v9/cm/deployment" > ~/cmf_config.json
       import:
   curl --upload-file ~/cmf_config.json -u admin:admin "http://localhost:8888/api/v9/cm/deployment?deleteCurrentDeployment=true"
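The two commands above can be wrapped in one small script and run from cron. This is a minimal sketch, not the official backup procedure: the host, port, credentials, and backup directory are all assumptions to adapt. It dry-runs by default (printing what it would do) so it is safe to try; set EXECUTE=1 to actually run the commands.

```shell
#!/bin/sh
# Nightly backup of both the SCM database and the API deployment JSON.
# Host, port, credentials, and BACKUP_DIR are assumptions -- adapt them.
BACKUP_DIR="${BACKUP_DIR:-/tmp/cm-backups}"
STAMP=$(date +%Y%m%d)
mkdir -p "$BACKUP_DIR"

run() {
    # Dry-run by default; set EXECUTE=1 to really run the commands.
    if [ "${EXECUTE:-0}" = "1" ]; then "$@"; else echo "would run: $*"; fi
}

run pg_dump -h localhost -p 7432 -U scm -W -F c -b -v \
    -f "$BACKUP_DIR/scm_db_$STAMP.db" scm
run curl -u admin:admin "http://localhost:8888/api/v9/cm/deployment" \
    -o "$BACKUP_DIR/cmf_config_$STAMP.json"
```

For unattended use you would drop -W (which forces a password prompt) and put the scm password in a ~/.pgpass file instead.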
 
 
If you did not back up and have lost the database, well, it is hard to restore, but it can be done.

The following are my steps:
  1 Reinstall a new Cloudera Manager on another machine with a different host name.

  2 Export a JSON configuration file from another currently working Cloudera Manager of the same version.
    That Cloudera Manager should include all the services you use (hdfs, yarn, hbase, hdfs HA, etc.).

  3 Run a script on every machine to get its hostid and hostname; the hostid is in this file:
    /var/lib/cloudera-scm-agent/uuid
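Step 3 can be a one-line function like the sketch below; the uuid path is the agent default from the step above, made overridable so the snippet can be tried on a scratch file. Running it on every host via an ssh loop (shown in the comment) is one possible way to collect the results.

```shell
#!/bin/sh
# Emit "<hostname><TAB><hostid>" for one host. The uuid file path is the
# agent default; pass a different path as $1 for testing.
host_id_line() {
    uuid_file="${1:-/var/lib/cloudera-scm-agent/uuid}"
    printf '%s\t%s\n' "$(hostname)" "$(cat "$uuid_file")"
}

# Typical use: run on every host and collect into one file, e.g.
#   for h in $(cat hosts.txt); do ssh "$h" 'sh hostid.sh'; done > host_ids.txt
```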
 
  4 Modify the JSON file from step 2 as follows:
    4.1 Delete all hosts in the JSON file and add every host's hostid and hostname from step 3.
    4.2 Delete all roles in the clusters' services.
    4.3 Change the cluster name to the old cluster id. If you do not remember the old cluster id,
      you can get it from the hdfs namenode's HTTP web page.
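The three JSON edits in step 4 can be scripted. This is only a sketch: the deployment JSON layout assumed here (a top-level "hosts" array with hostId/hostname entries, clusters containing services containing roles, and a cluster "name" field) is what the /cm/deployment export generally looks like, but verify it against your own export before trusting this.

```shell
#!/bin/sh
# Sketch of step 4. Usage:
#   fix_deployment export.json host_ids.txt OLD_CLUSTER_ID out.json
# where host_ids.txt is the "hostname<TAB>hostid" file collected in step 3.
fix_deployment() {
    python3 - "$1" "$2" "$3" "$4" <<'PYEOF'
import json, sys

src, hosts_file, old_cluster, dst = sys.argv[1:5]
with open(src) as f:
    dep = json.load(f)

# 4.1: replace the exported host list with our real hosts.
dep["hosts"] = []
with open(hosts_file) as f:
    for line in f:
        hostname, host_id = line.rstrip("\n").split("\t")
        dep["hosts"].append({"hostId": host_id, "hostname": hostname})

for cluster in dep.get("clusters", []):
    # 4.2: delete all role assignments (they point at the old hostIds).
    for service in cluster.get("services", []):
        service["roles"] = []
    # 4.3: rename the cluster to the old cluster id.
    cluster["name"] = old_cluster

with open(dst, "w") as f:
    json.dump(dep, f, indent=2)
PYEOF
}
```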
 
  5 Import the new JSON file into the newly created Cloudera Manager.

  6 Reconfigure all the service parameters and add all instances to the services as before. (You should not use a host
    template, because the agents are not reporting to the new cmf server yet; however, you can add instances from each service.)

  7 If you have hdfs HA enabled, you have to export the JSON file again, add the hdfs HA roles to it, and import it
    again.
 
 
  8 If you do not want to stop all the services, you have to do the following steps to get all the process_ids from
    all hosts; if you can stop the services, you can skip to step 11.

  9 Run the following script on all hosts to get each process_id and service name:

grep "spawned:.*with pid" /var/log/cloudera-scm-agent/supervisord.log  |awk -vhost=$HOSTNAME '{ idx=index($5,"-"); name=substr($5,idx+1,length($5)-idx-1);pid=substr($5,2,idx-2);cc[name]=pid;}END{a=host;for (b in cc){ a=a"\t"b"\t"cc[b]} print a}'
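To see what the one-liner extracts, you can run it against a fabricated log file. The line format below ("spawned: '&lt;process_id&gt;-&lt;role name&gt;' with pid &lt;os pid&gt;") and all the values are made up for illustration; check a real supervisord.log before relying on it. Note that the number extracted is the CM process_id embedded in the quoted name, not the OS pid at the end of the line.

```shell
#!/bin/sh
# Feed the one-liner a fabricated supervisord.log and print its output:
# one line per host of "hostname<TAB>role<TAB>process_id" pairs.
LOG=/tmp/supervisord.sample.log
cat > "$LOG" <<'EOF'
2015-02-10 10:00:01,000 INFO spawned: '1546-hdfs-NAMENODE' with pid 4321
2015-02-10 10:00:02,000 INFO spawned: '1547-hdfs-DATANODE' with pid 4322
EOF
OUT=$(grep "spawned:.*with pid" "$LOG" | awk -vhost="$(hostname)" '{ idx=index($5,"-"); name=substr($5,idx+1,length($5)-idx-1); pid=substr($5,2,idx-2); cc[name]=pid } END { a=host; for (b in cc) { a=a"\t"b"\t"cc[b] } print a }')
echo "$OUT"
```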
 
   10 Parse that output and insert a record into the processes table of the SCM database. Before inserting into the processes table,
      you have to insert another record into the commands table to get a new command_id. (This step is hard.)

   11 Change server_host in /etc/cloudera-scm-agent/config.ini on every host to the new Cloudera Manager, kill the cmf listener, and restart all the cmf agents.
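The config.ini edit in step 11 can be scripted per host. A sketch, assuming the stock agent config path (made overridable so it can be tried on a scratch file); the restart command depends on your init system.

```shell
#!/bin/sh
# Point an agent at the new Cloudera Manager server (step 11). Run on every
# host; pass a different config path as $2 when trying it out on a test file.
repoint_agent() {
    new_server="$1"
    config="${2:-/etc/cloudera-scm-agent/config.ini}"
    sed -i "s/^server_host=.*/server_host=$new_server/" "$config"
}

# After editing, restart the agent on that host, e.g.:
#   service cloudera-scm-agent restart
```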
 
 
   Everything should be fine: the agents will report to the new Cloudera Manager server, which will return the same process_ids to them,
   so the running processes will not be killed.

  
  

