
Backup NameNode and DataNodes

Super Collaborator

Hi:

Which directories on my machines do I need to back up daily or monthly? I mean /usr/hdp, /var/log, etc.?

Many thanks

1 ACCEPTED SOLUTION

Contributor
Alongside configuring the NN disks for RAID, it is recommended to back up the following to safeguard your cluster from failures (a minimal command sketch follows the list):
  • HDFS data (can be done using Falcon)
  • Hive data (can be done using Falcon)
  • HBase data (set up HBase cluster replication)
  • Hive metadata (can be done using Falcon between clusters; also set up the underlying metastore database in HA / active-active mode within the cluster)
  • Regular backups of the databases used by Ambari, Oozie, and Ranger
  • Configurations
    • Ambari Server and Agent configurations (Ambari folders under /etc and /var)
    • Configuration files for each application or service under the /etc directory
    • Binaries (/usr/hdp/current)
    • Any OS-level configuration changes made at each node of the cluster
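
For concreteness, here is a minimal sketch of the database and configuration pieces of such a backup. The database names, the backup_user credentials, and the /backup target are assumptions for illustration (a MySQL-backed Ambari/Oozie/Ranger setup); adapt everything to your environment:

  #!/bin/bash
  # Hypothetical backup sketch -- database names, credentials, and the
  # /backup target are placeholders; adjust them for your cluster.
  DATE=$(date +%F)
  BACKUP_DIR="/backup/$DATE"
  mkdir -p "$BACKUP_DIR"

  # Dump the databases used by Ambari, Oozie, and Ranger
  # (assumes a MySQL backing store; use pg_dump for PostgreSQL).
  for db in ambari oozie ranger; do
      mysqldump -u backup_user -p"$DB_PASS" "$db" > "$BACKUP_DIR/$db.sql"
  done

  # Archive Ambari server/agent state and per-service configs under /etc
  tar czf "$BACKUP_DIR/ambari.tar.gz" \
      /etc/ambari-server /etc/ambari-agent \
      /var/lib/ambari-server /var/lib/ambari-agent
  tar czf "$BACKUP_DIR/etc-configs.tar.gz" /etc/hadoop /etc/hive /etc/hbase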


3 REPLIES


@Roberto Sancho

Yes, we can take a backup of the NN & DN configurations:

/etc/hadoop/conf -- the Hadoop configuration files

/usr/hdp/2.3.4.0-3485 -- holds all the JARs (Sqoop JARs, etc.)
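
For example, a nightly cron job archiving those two locations might look like this sketch (the /backup destination is an assumption; substitute your own HDP version directory):

  # Nightly archive of the Hadoop configs and the HDP stack directory;
  # /backup is a placeholder destination, version dir matches HDP 2.3.4.
  DATE=$(date +%F)
  tar czf "/backup/hadoop-conf-$DATE.tar.gz" /etc/hadoop/conf
  tar czf "/backup/hdp-stack-$DATE.tar.gz" /usr/hdp/2.3.4.0-3485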

Super Collaborator

Thanks!

Just that? Don't I need anything else from HDP? And what about the CentOS OS?


Is the NN OS configured with RAID? If yes, then I don't think any additional backup is required.

Just FYI, here is what we are doing:

1) Our master nodes (NN & RM) are RAID-configured

2) Taking regular backups of the databases (Ambari, Hue, Oozie, etc.)

3) Backing up HDFS data using Falcon/distcp/snapshots (see the sketch after this list)

4) /etc/hadoop/conf

5) /usr/hdp/2.3.4.0-3485/
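
As a concrete illustration of point 3, a snapshot-plus-DistCp backup could look like the following sketch; the /data path and the backup-nn target URI are hypothetical:

  # Allow snapshots on the directory (once, as the hdfs superuser);
  # /data and the backup cluster's NameNode URI are placeholders.
  SNAP="backup-$(date +%F)"
  hdfs dfsadmin -allowSnapshot /data
  hdfs dfs -createSnapshot /data "$SNAP"

  # Copy the consistent snapshot to a second cluster with DistCp
  hadoop distcp "/data/.snapshot/$SNAP" "hdfs://backup-nn:8020/backups/$SNAP"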

I hope this helps! :)
