
How to recover an HDP cluster by installing HDP from scratch while preserving the data on the DataNode disks and the Kafka disks


hi all

 

 

I want to ask this important question

 

let's say we have the following:

an HDP cluster with:

3 master machines (active/standby NameNode, active/standby ResourceManager)

3 DataNode machines
   - each DataNode machine has 4 disks for HDFS (not including the OS disk)

3 Kafka machines
   - each Kafka machine has one 10 TB disk (not including the OS disk)

 

now we want to reinstall the whole cluster from scratch, including HDP and Ambari,

 

but keep the data on the DataNode machines and the Kafka topic data, as follows:

 

we unmount the disks on the DataNode machines and the Kafka machines

 

 

example

 

on a DataNode machine (note - /etc/fstab is already configured):

 

umount /grid/data1

umount /grid/data2

.

.

.
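and similarly on a Kafka machine (the mount point below is just an illustration - use whatever path your Kafka log.dirs actually points to):

umount /kafka/data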

 

 

so in the second, from-scratch installation we install the whole cluster (by blueprint), but without the DataNode HDFS disks and the Kafka topic disks (scratch installation means a fresh new Linux OS)

 

after installation we mount all the DataNode machine disks and the Kafka machine disks (where all the topics are stored)

example

 

on a DataNode machine (note - /etc/fstab is already configured):

 

mount /grid/data1

mount /grid/data2

.

.

.

 

 

in order to complete the picture, we need to restart HDFS, YARN, and Kafka
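For example, the restart can be done from the Ambari UI, or with the Ambari REST API - a rough sketch, assuming Ambari runs on ambari-host:8080, the cluster is named MYCLUSTER and the admin credentials are used (all of these are placeholders):

# stop HDFS (state INSTALLED = stopped), then start it again (state STARTED)
curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
  -d '{"RequestInfo":{"context":"Stop HDFS"},"Body":{"ServiceInfo":{"state":"INSTALLED"}}}' \
  http://ambari-host:8080/api/v1/clusters/MYCLUSTER/services/HDFS

curl -u admin:admin -H 'X-Requested-By: ambari' -X PUT \
  -d '{"RequestInfo":{"context":"Start HDFS"},"Body":{"ServiceInfo":{"state":"STARTED"}}}' \
  http://ambari-host:8080/api/v1/clusters/MYCLUSTER/services/HDFS

# the same pattern applies to the YARN and KAFKA services (change the service name at the end of the URL)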

 

so - could this scenario work?

 

Michael-Bronson
1 ACCEPTED SOLUTION

Expert Contributor

@mike_bronson7 you just need to back up /hadoop/hdfs/namenode/current from the active NameNode.

 

Also, if you take the backup, say, a week before the activity and your first cluster keeps serving client requests, then you will lose the data that was written after the backup.

So the best approach is to run saveNamespace and take the backup right when you are going to do the activity, and freeze clients so they are not accessing the cluster.

 


7 REPLIES

Expert Contributor

@mike_bronson7 

 

What I got from your scenario is that in the second scratch installation your master nodes [i.e. active/standby NameNode] are freshly installed, and you are only adding the DataNodes which already have pre-existing data [from the other cluster], right?

-- In this case it's not possible to get the cluster up with the data from the restored disks, since the NameNode will not have any information about the blocks lying in the block storage on the DataNode disks.

 

If you have a support subscription with Cloudera, you can approach them for a DR scenario where they can help you get the existing data from the DataNodes added back into the cluster [not sure if it can be recovered/added back 100%].

The same applies to Kafka.


ok

can you summarize all the options to recover the NameNode (also out-of-the-box options)?

Michael-Bronson

Expert Contributor

1. If you can back up the metadata from the original cluster (where the DataNodes existed at first) and copy that metadata to the new cluster, then that's the best option.

2. If you are not able to go with point 1, then you can probably try the "hadoop namenode -recover" option, as sketched below.
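A rough illustration of option 2 (only a sketch - the command runs on the NameNode host while the NameNode process is stopped, it is interactive, and it may discard unrecoverable metadata, so copy the current directory somewhere safe first):

# stop the NameNode (e.g. via Ambari), then on the NameNode host:
cp -r /hadoop/hdfs/namenode/current /hadoop/hdfs/namenode/current.bak   # safety copy of the metadata
hadoop namenode -recover   # answer the interactive prompts, then start the NameNode again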

 

The links below might be useful:

 https://blog.cloudera.com/understanding-hdfs-recovery-processes-part-1/

https://clouderatemp.wpengine.com/blog/2015/03/understanding-hdfs-recovery-processes-part-2/

 


about option one

 

I guess you don't mean backing up the metadata by just copying it with scp or rsync;

maybe you mean there is a dedicated backup tool, like barman for PostgreSQL?

so do you know of a tool for this option?

 

on each NameNode we have the following folders:

 

/hadoop/hdfs/namenode/current (where the fsimage exists)

/hadoop/hdfs/journal/hdfsha/current/

 

do you mean to back up only these folders, let's say every day?

 

 

Michael-Bronson

Expert Contributor
  • By backup I mean: copy the NameNode current directory only.
  • First turn safe mode on and then save the namespace. Once both commands are executed, take a backup of the NameNode current directory from the active node (see the sketch after this list).
  • You can copy it to the destination/new cluster using any command (like scp) or tool; scp will be the simplest option.
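A minimal sketch of those steps (the destination host name and paths below are only placeholders - adjust them to your environment):

hdfs dfsadmin -safemode enter        # freeze writes so the metadata stays consistent
hdfs dfsadmin -saveNamespace         # flush the latest fsimage into the namenode current directory
tar czf /tmp/nn-current.tar.gz /hadoop/hdfs/namenode/current
hdfs dfsadmin -safemode leave
scp /tmp/nn-current.tar.gz new-namenode-host:/tmp/   # copy the backup to the new cluster's namenode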


since we have both current folders:

 

/hadoop/hdfs/namenode/current (where the fsimage exists)

/hadoop/hdfs/journal/hdfsha/current/

 

do you mean to back up both of them?

 

 

second, from a time perspective, how old can the backup be - for example one week or more?

 

Michael-Bronson

Expert Contributor

@mike_bronson7 you just need to back up /hadoop/hdfs/namenode/current from the active NameNode.

 

Also, if you take the backup, say, a week before the activity and your first cluster keeps serving client requests, then you will lose the data that was written after the backup.

So the best approach is to run saveNamespace and take the backup right when you are going to do the activity, and freeze clients so they are not accessing the cluster.