Lost HDFS data when upgrade of CM + CDH




Hi to all,


I upgraded from CM 5.1.2 to CM 5.3.2 and also upgraded CDH using parcels (Cloudera Express, without internet access). I was surprised that the data I had in HDFS under the old CDH 5.1.2 was not usable from the new CDH 5.3.2 parcel. Is there a way to transfer data from the old HDFS into the new version of CDH when upgrading? I did not find any note about it when I followed the upgrade manual from the Cloudera site.


When I try to activate the old 5.1.2 parcel in the new CM 5.3.2, the services fail to start.


Thank you for any reply or help!



Best regards,


Václav Surovec


Re: Lost HDFS data when upgrade of CM + CDH

Expert Contributor

What is the error message when you try upgrading?


I usually watch mine with:


tail -f /var/log/cloudera-*/*.log


Then I find the issue and resolve it accordingly.


Can you share the error logs, please?
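
If following the live tail is too noisy, a small filter can pull out only the error lines to paste here. This is a minimal sketch, not an official Cloudera tool: the function name `show_log_errors` is made up for illustration, and it assumes the logs mark problems with the words ERROR or FATAL.

```shell
# Print only ERROR/FATAL lines from every *.log file in one log directory.
# show_log_errors is a hypothetical helper name, not part of Cloudera Manager.
show_log_errors() {
  logdir="$1"
  # -H prefixes each match with its file name; -E enables the ERROR|FATAL alternation
  grep -HE 'ERROR|FATAL' "$logdir"/*.log
}

# Example, using the same path pattern as the tail command above:
#   for d in /var/log/cloudera-*; do show_log_errors "$d"; done
```

The output can get long, so piping it through `tail -n 50` is one way to keep just the most recent matches.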

Re: Lost HDFS data when upgrade of CM + CDH




I did not get any error message during the upgrade. I just had the old parcel installation in one folder and then the new parcel in another folder. When I finished the upgrade, I had to "install" all the services (HDFS, YARN...) again in Cloudera Manager, and when setting up the NameNodes and DataNodes for HDFS, I had to choose a non-empty folder for them.


I would like to know whether, and how, I can transfer the data from my old HDFS under the old parcel to the new parcel installation. There was nothing about it in the upgrade manual.


Thank you!

Re: Lost HDFS data when upgrade of CM + CDH

Super Collaborator

How did you perform the upgrade of CM?  What did you use to actually upgrade CM, yum?

Re: Lost HDFS data when upgrade of CM + CDH

Super Collaborator

Audit the contents of /etc/cloudera-scm-server/ and any copies of the file... generally speaking it sounds like you actually completely over-installed a second time, creating a new SCM db that has nothing deployed within it.
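
As a rough sketch of that audit: the server's database settings normally live in `db.properties` in that directory, so printing them (minus the password) shows which database each copy points at. The function name `show_scm_db_settings` is hypothetical, and with a tarball install the file may sit under the tarball root instead of `/etc`, which is an assumption to verify on your system.

```shell
# Print the Cloudera Manager database settings from a db.properties file,
# hiding the password line. show_scm_db_settings is a hypothetical helper name.
show_scm_db_settings() {
  # Default path is the packaged location; pass another path for other copies.
  props="${1:-/etc/cloudera-scm-server/db.properties}"
  [ -r "$props" ] || { echo "cannot read $props" >&2; return 1; }
  # Keep only the com.cloudera.cmf.db.* keys and drop the password entry
  grep '^com\.cloudera\.cmf\.db\.' "$props" | grep -v 'password'
}

# Example: compare the packaged copy with any other copy you find
#   show_scm_db_settings
#   show_scm_db_settings /path/to/other/db.properties
```

If two copies name different databases or hosts, that would fit the theory that the second install created a fresh, empty SCM database.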


If you set up over the same paths, re-installing everything, you should have received errors.


It should be possible to make a backup of the current setup and then fall back to the previous one, but you need to give more detail.

DO NOT delete or remove any of your HDFS paths or rename them, leave that intact. 
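
To check whether the old HDFS directories still hold data without touching them, a read-only test like the one below may help. It relies on the fact that a formatted NameNode or DataNode storage directory contains a `current/VERSION` file. The function name and the example paths `/dfs/nn` and `/dfs/dn` are assumptions; use whatever directories your old HDFS configuration pointed at.

```shell
# Return 0 if a directory looks like an already-formatted HDFS storage directory.
# Read-only check; looks_like_hdfs_dir is a hypothetical helper name.
looks_like_hdfs_dir() {
  # Formatted NameNode and DataNode directories keep a VERSION file under current/
  [ -f "$1/current/VERSION" ]
}

# Example (paths are assumptions, substitute your own):
#   for d in /dfs/nn /dfs/dn; do
#     if looks_like_hdfs_dir "$d"; then echo "$d: existing HDFS data"; else echo "$d: no HDFS data found"; fi
#   done
```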

Re: Lost HDFS data when upgrade of CM + CDH


First I performed the CM update. I used the tarball for it, so I did not use yum (then I also updated Java).

Then I was still getting some database errors when I tried to run the CM agent or server, so I ran the prepare_database script. Some error log told me that the "scm" table is not [...], so I deleted it and created a new "scm", which is probably exactly what you suspected.


Then I was able to run with no errors:

sudo /opt/cloudera-manager/cm-5.3.2/share/cmf/schema/ mysql scm scm scm -uroot -proot


What should I have done instead? I cannot give you the details from the error log at that time, but it was clear that I had to delete the "scm" table in order to continue...


I have everything backed up, all the tables and node data. I have already restored the data manually from other sources, but I would like to know what to do next time I upgrade. From my perspective, the manual does not read like an upgrade manual but like a new-installation manual, because I repeated all the steps from the first installation :)
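
For next time, one precaution I could imagine is dumping the "scm" database to a file before running any script that touches it. This is only a sketch under assumptions: the function name `backup_db` and the `DUMP_CMD` override are made up for illustration, and it assumes MySQL with the root user as in my command above.

```shell
# Dump one MySQL database to a timestamped .sql file before modifying it.
# backup_db and DUMP_CMD are hypothetical names used for this sketch only.
backup_db() {
  db="$1"
  outdir="$2"
  out="$outdir/${db}-$(date +%Y%m%d-%H%M%S).sql"
  # DUMP_CMD lets the function be dry-run; by default it calls mysqldump.
  # -p with no value makes mysqldump prompt for the password.
  ${DUMP_CMD:-mysqldump} -u root -p --databases "$db" > "$out" || return 1
  # Print the path of the dump file so it can be logged or copied elsewhere
  printf '%s\n' "$out"
}

# Example:
#   backup_db scm /root/backups
```

With such a dump, the old database could be loaded back with the mysql client instead of being recreated empty.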


Thank you for your reply!