
Cluster 'back-up' - does it make sense?

Super Collaborator

We have recently installed HDP 2.4 using Ambari 2.2.2.0.

One of the 'requirements' from the business (who, I am sure, don't understand the difference between a traditional RDBMS and Hadoop) and from the infrastructure and Linux teams is: 'provide the strategy and steps to back up (?) the cluster'.

I am fairly sure it doesn't make sense to back up the cluster data, which will run into petabytes (correct me if I am wrong).

That leaves the cluster 'metadata'. On the Internet, I came across several posts and the HDP documentation (this, this and this) suggesting it be backed up.

I have several thoughts and points of confusion about this:

  1. Is it a real-world/sensible practice to back up the metadata? Can it be done regularly without cluster downtime? Is there any relevant documentation?
  2. I don't understand how to 'back up' the HDFS checkpoints/snapshots - can anyone explain their significance in case the cluster has to be restored?
  3. Suppose one or all DataNodes go down, or one or all NameNodes go down - does the backed-up metadata help in either case?

Overall, I am confused about whether the 'back-up' concept is even sane in Hadoop, what the practical steps are to do it, and how (and when) to use it.

1 ACCEPTED SOLUTION

Super Guru

@Kaliyug Antagonist Hi

I would disagree with your assumption that it doesn't make sense to back up petabytes of data. Think about what you would do if there were a fire in the data center and your data were physically destroyed. Even at petabyte scale, it is very important to have a backup and DR strategy.

Now, snapshots only capture data at a point in time. You can mark a directory "snapshottable" and then create snapshots of the data in that directory. This gives you the ability to go back and restore the data in that directory to a particular point in time. Please see the following link for more details:

https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-hdfs/HdfsSnapshots.html
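
To make that concrete, a minimal sketch looks like the following (the /data/important path and the snapshot name are just examples, not anything specific to your cluster):

    # Allow snapshots on a directory (one-time, as the HDFS administrator)
    hdfs dfsadmin -allowSnapshot /data/important

    # Create a named point-in-time snapshot of that directory
    hdfs dfs -createSnapshot /data/important nightly-2016-08-01

    # Snapshots appear under the read-only .snapshot directory
    hdfs dfs -ls /data/important/.snapshot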

Snapshots alone still don't solve the problem you are describing, because they live on the same cluster as the original data. You need to back the data up to another cluster, using either distcp directly or a tool like Falcon to help orchestrate it. Please see the following links:

https://community.hortonworks.com/questions/394/what-are-best-practices-for-setting-up-backup-and.ht...

http://hortonworks.com/apache/falcon/
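
As a rough sketch of the distcp approach (the hostnames and paths below are placeholders for your production and DR clusters):

    # Copy (and keep in sync) a directory on the DR cluster
    hadoop distcp -update -delete \
      hdfs://nn-prod.example.com:8020/data/important \
      hdfs://nn-dr.example.com:8020/data/important

    # Copying from a snapshot path instead gives you a consistent source image
    hadoop distcp \
      hdfs://nn-prod.example.com:8020/data/important/.snapshot/nightly-2016-08-01 \
      hdfs://nn-dr.example.com:8020/backups/important/2016-08-01

Falcon, in essence, schedules and manages replication jobs like these for you.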

As for your question number 3, when your DataNodes or NameNode go down, I don't think your backups help. When a DataNode goes down, Hadoop takes care of re-creating the lost copies by re-replicating the data, and someone in operations will likely be working to bring the DataNode back up. Similarly, if your NameNode goes down, your cluster should fail over to the standby NameNode while your operations team works to restore the lost NameNode. Backing up metadata doesn't help in this particular case because, between the active NameNode, the standby NameNode and the Quorum Journal Manager, you already have multiple copies of the metadata (this does not discount the importance of a backup and DR strategy that includes metadata backups). Please check the following links; they will help you understand how this works:

http://hortonworks.com/blog/namenode-high-availability-in-hdp-2-0/

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.3.2/bk_hadoop-ha/content/ch_HA-NameNode.html (If you are interested in learning more details)
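
If you want to verify the HA behaviour yourself, a couple of standard commands help (nn1 and nn2 below stand for whatever NameNode IDs you configured under dfs.ha.namenodes; they are just examples here):

    # Check which NameNode is currently active and which is standby
    hdfs haadmin -getServiceState nn1
    hdfs haadmin -getServiceState nn2

    # Trigger a graceful failover (only if automatic failover via ZKFC is not enabled)
    hdfs haadmin -failover nn1 nn2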

Thanks

Imad


3 REPLIES


@Kaliyug Antagonist this is a large topic, but I think I can provide some context and links that may help you research further and address some of your confusion. Please note that it's important to distinguish service availability from disaster recovery/business continuity concerns when evaluating the various solutions in play.

Regarding the three links you’ve included, the context here is an upgrade of HDP where these backups are important should the upgrade need to be reverted (which I believe to be a narrower scope than the broader concern your infrastructure/Linux folks are raising).

There are two distinct notions of metadata at play here: 1) the NameNode metadata contained in the fsimage and edits files (see http://hortonworks.com/blog/hdfs-metadata-directories-explained/), and 2) the metadata associated with specific Hadoop services (such as Ambari, Hive, etc.).

For 2), it is recommended to store the metadata for all services in an external RDBMS, in a database that has an associated backup schedule defined. For example, this HCC post contains some answers about backing up Hive metadata, as well as some other DR considerations which may be of interest.
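
For instance, if the Hive Metastore and Ambari databases live in an external MySQL instance, a simple scheduled dump is often enough. A sketch, assuming MySQL with databases named hive and ambari and a /backups/metastore target (all of which are assumptions to adapt to your environment):

    # Nightly dump of the Hive Metastore database
    # (prompts for the password; use a credentials file for unattended cron runs)
    mysqldump -u hive -p hive > /backups/metastore/hive_$(date +%F).sql

    # Same idea for the Ambari database if it is also in MySQL
    mysqldump -u ambari -p ambari > /backups/metastore/ambari_$(date +%F).sql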

As for the NameNode metadata itself, keep in mind that normal NN failures are handled by the Standby NameNode (it is recommended to configure the NameNode as a highly available service with a Standby NameNode; see https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.0/bk_hadoop-ha/content/ch_HA-NameNode.html).

In the case that both NNs fail, it is important to have a backup of the most recent copy of the fsimage. The fsimage file contains the complete state of the file system at a point in time. Since HDFS metadata is stored in RAM for performance reasons, a durable copy of the fsimage is required to restore the cluster in a scenario where both NameNode services fail (this HCC post contains some further information).
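
One low-effort way to keep such a durable copy is to pull the latest fsimage off the NameNode on a schedule. A sketch (the local directory and remote backup host are just examples):

    # Download the most recent fsimage from the NameNode to local disk
    hdfs dfsadmin -fetchImage /backups/namenode-meta/

    # Then ship it off-cluster, e.g. with rsync, as part of the same cron job
    rsync -a /backups/namenode-meta/ backup-host.example.com:/srv/hdp-nn-meta/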

The failure of a DN is not an important consideration in the context of your question, since DNs do not store HDFS metadata and block resiliency is built into HDFS via replication of blocks to other data nodes. The NN will coordinate re-replication of blocks as required to maintain the configured level of replication.
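
If you want to see that re-replication at work, the usual health checks are (nothing assumed here beyond a working HDFS client):

    # Summary of missing, corrupt and under-replicated blocks
    hdfs fsck /

    # Live/dead DataNodes and per-node capacity
    hdfs dfsadmin -report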

It’s important to distinguish checkpoints of HDFS metadata from HDFS Snapshots. Snapshots can be used to protect important enterprise data sets from user or application errors, similar to a traditional database backup. See http://hortonworks.com/hadoop-tutorial/using-hdfs-snapshots-protect-important-enterprise-datasets/
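
Restoring from a snapshot is just a copy out of the read-only .snapshot directory; for example (the path, snapshot name and file name below are placeholders):

    # Recover an accidentally deleted or corrupted file from a snapshot
    hdfs dfs -cp /data/important/.snapshot/nightly-2016-08-01/part-00000 /data/important/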

The business continuity/disaster recovery strategy for the HDFS data itself is usually covered by a replication strategy to a backup/DR cluster, and the Apache Falcon project is intended to make this easier. See http://hortonworks.com/hadoop-tutorial/mirroring-datasets-between-hadoop-clusters-with-apache-falcon... for some introductory documentation.



Hi @Kaliyug Antagonist. The answers above from @slachterman and @mqureshi are excellent.

Here is another, higher-level way to look at this problem, with some tips for planning out a DR strategy for the smoldering-datacenter scenario mentioned above.

1. Use the term Disaster Recovery instead of Backup. This gets the administrators to move away from the RDBMS-like idea that they can simply run a script and recover the entire cluster.

2. Discuss RTO/RPO and let the business's answers drive the architecture. RTO (Recovery Time Objective) and RPO (Recovery Point Objective) requirements need to be defined by the business - these requirements drive all decisions around disaster recovery.

A 1-hour/1-hour RTO/RPO is wildly different in cost and architecture from a 2-week/1-day RTO/RPO. When the business chooses the RTO/RPO requirements, it is also choosing the required cost and architecture.

By having well-defined RTO/RPO requirements you will avoid an over-engineered solution (which may be far too expensive) as well as an under-engineered solution (which may fail precisely when you need it most - during a disaster event).

3. 'Band' your data assets into different categories for RTO/RPO purposes.

Example: Band 1 = 1-hour RTO; Band 2 = 1-day RTO; Band 3 = 1-week RTO; Band 4 = 1-month RTO; Band 5 = not required in the event of a disaster.

You would be surprised how much data can wait in the event of a SEVERE crash. For example, datasets that are only used to produce a report distributed once per month should never require a 1-hour RTO.

Hope that helps.