We are facing a critical issue: we lost our Cloudera Manager configuration database (a separate Oracle DB) and our Kerberos server, so we cannot start the Cloudera Manager server now. I think the HDFS service should still be there, though, so we could update the configuration to disable Kerberos and start HDFS to recover our data.
When I tried to start the HDFS cluster manually from the command line, I found that we do not have the “hadoop-hdfs-namenode”, “hadoop-hdfs-datanode”, or other “hadoop-hdfs-*” services in the init.d directory; we only have “cloudera-scm-agent” and “cloudera-scm-server” in init.d:
[root@inthdpname01 init.d]# ll
-rwxr-xr-x 1 root root 8594 Jul 7 2017 cloudera-scm-agent
-rwxr-xr-x 1 root root 8436 Jul 7 2017 cloudera-scm-server
-rw-r--r--. 1 root root 13948 Sep 16 2015 functions
-rwxr-xr-x 1 root root 9972 Jan 21 2012 jexec
-rwxr-xr-x. 1 root root 2989 Sep 16 2015 netconsole
-rwxr-xr-x. 1 root root 6630 Sep 16 2015 network
-rw-r--r-- 1 root root 59 May 13 2016 output
-rw-r--r--. 1 root root 1160 Nov 20 2015 README
-rwxr-xr-x. 1 root root 41724 May 4 2016 vmware-tools
Would like to seek help that:
If Cloudera Manager is managing your cluster, you need it to start your services.
This is because Cloudera Manager creates all the configuration files for your services and then uses shell scripts to start the process. Without Cloudera Manager, you will not have an up-to-date set of configuration files for the process. In essence, there is no "easy" way to start your HDFS service.
If you lost your Cloudera Manager database and have no means of recovery, it is possible to create a new one, but it will require a great deal of work. You would need to add back the same services and roles that were on each host. The agents should still be heartbeating to Cloudera Manager, so CM should allow you to select those Managed Hosts and add them to your cluster. You would need to make sure all your HDFS disk paths are correct and add back any configuration items as you had them before.
For a frame of reference, you can look at existing process directories. When CM starts a process it will lay down the configuration files in
You can look at all the configuration files for each Role (process) there to guide you.
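As a rough aid for that review, here is a minimal Python sketch that picks the most recent process directory for each role from a list of directory names. It assumes the directories are named "<numeric id>-<service>-<ROLE>" (for example "40-hdfs-NAMENODE") and that a higher id means a newer start; both are assumptions to check against what you actually see on your hosts, and latest_process_dirs is a hypothetical helper name, not part of Cloudera Manager.

```python
import re
from typing import Dict, Iterable

# Assumed naming pattern: "<numeric id>-<service>-<ROLE>", e.g. "40-hdfs-NAMENODE".
_PROCESS_DIR = re.compile(r"^(\d+)-(.+)$")


def latest_process_dirs(names: Iterable[str]) -> Dict[str, str]:
    """Map each role suffix to the directory name with the highest numeric id."""
    latest = {}  # role suffix -> (numeric id, directory name)
    for name in names:
        match = _PROCESS_DIR.match(name)
        if match is None:
            continue  # skip anything that does not follow the assumed pattern
        proc_id, role = int(match.group(1)), match.group(2)
        if role not in latest or proc_id > latest[role][0]:
            latest[role] = (proc_id, name)
    return {role: name for role, (_, name) in latest.items()}
```

You could feed it os.listdir() of the process directory root on each host and then read the configuration files inside the directories it returns.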
If you are installing Cloudera Manager on a new host, you will need to update the "server_host" configuration in the /etc/cloudera-scm-agent/config.ini file for your agents.
If you really need to get the cluster running right away before you can start rebuilding the Cloudera Manager information, let us know. There is a way to do it from the command line, but it is a bit tricky and unproven.
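If you have many agents to repoint, the edit can be scripted. The following is a minimal Python sketch, assuming config.ini contains an uncommented "server_host=<value>" line; set_server_host is a hypothetical helper name and the host names in the usage note are placeholders. It edits the text line by line rather than re-serializing the file, so comments and other settings are left untouched.

```python
import re

# Matches an uncommented "server_host = value" line; commented lines starting with "#" are ignored.
_SERVER_HOST = re.compile(r"^([ \t]*server_host[ \t]*=[ \t]*).*$", re.MULTILINE)


def set_server_host(config_text: str, new_host: str) -> str:
    """Return config_text with the server_host value replaced by new_host."""
    updated, count = _SERVER_HOST.subn(lambda m: m.group(1) + new_host, config_text)
    if count == 0:
        # Fail loudly instead of silently writing back an unchanged file.
        raise ValueError("no uncommented server_host line found")
    return updated
```

For example, read /etc/cloudera-scm-agent/config.ini into a string, call set_server_host(text, "newcm.example.com"), write the result back after keeping a backup copy, and then restart the cloudera-scm-agent service on that host.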
Thank you very much for your reply and advice. Now we know that CM is necessary to start CDH because the cluster is managed by CM.
Unfortunately, our NameNode (2) and DataNode (4) hosts have been rebooted, so I cannot recover the process directories managed by the agents from “/var/run/cloudera-scm-agent/process”.
Now I am trying to install a new CM on a new set of hosts to simulate the existing CDH cluster. My plan (based on a post like this one: https://community.cloudera.com/t5/Cloudera-Manager-Installation/cloudera-manager-database-lost/td-p/...) is that after I install the new CM:
1. Extract the hosts and roles configuration as a JSON file;
2. Update the host names in the JSON file to those of the existing CDH hosts;
3. Import the JSON file into CM;
4. Update server_host in /etc/cloudera-scm-agent/config.ini on all hosts to point to the new CM;
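Step 2 of this plan could be sketched in Python roughly as follows. This is only an illustration under assumptions: the real layout of the exported JSON is not shown in this thread, so the sketch simply walks the whole document and replaces any string value that exactly matches an old host name. The field names and host names in the usage note are made up, and if the export refers to hosts by ID rather than by name, those references would need separate handling.

```python
from typing import Any, Dict


def rename_hosts(node: Any, mapping: Dict[str, str]) -> Any:
    """Return a copy of a parsed JSON tree with exact-match host names replaced."""
    if isinstance(node, dict):
        # Keys are left alone; only values are rewritten.
        return {key: rename_hosts(value, mapping) for key, value in node.items()}
    if isinstance(node, list):
        return [rename_hosts(item, mapping) for item in node]
    if isinstance(node, str):
        # Replace only whole-string matches, never substrings.
        return mapping.get(node, node)
    return node  # numbers, booleans and null pass through unchanged
```

Usage would be along the lines of loading the export with json.load, calling rename_hosts(data, {"simhost01": "namenode01"}), and writing the result with json.dump before importing it. Matching whole strings only is a deliberate choice, so that a host name appearing inside a longer value (for example a URL) is not rewritten by accident; those would have to be checked by hand.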
One challenge for me is that we cannot clearly remember all the roles on each host, because we had installed all services (including HDFS, HBase, Hive, Impala, ZooKeeper, Spark, Hue, etc.). HDFS and HBase should be OK, though, since our host names are like namenode01, namenode02, datanode01, datanode02, datanode03, datanode04…
I would like to ask for help with the following:
1. In the newly installed CM, if some role configurations do not match the existing CDH cluster, is there any impact, especially on the data?
2. Can I install only the new CM and point the hosts at the existing CDH during the installation process? I am afraid the new CM will reinstall CDH and affect the existing installation.
By the way, I would appreciate it very much if you could share the way to start HDFS without CM that you mentioned before.