Backup NameNode and DataNodes

Master Collaborator

Hi:

Which directories on my machines do I need to back up daily or monthly? I mean /usr/hdp, /var/log, and so on?

Many thanks

1 ACCEPTED SOLUTION

Rising Star
Alongside configuring NN disks for RAID, it is recommended to back up the following to safeguard your cluster from failures:
  • HDFS data (Can be done using Falcon)
  • Hive data (Can be done using Falcon)
  • HBase data (Set up HBase Cluster Replication)
  • Hive metadata (Can be done using Falcon between clusters. Also set up the underlying metastore database in HA / active-active mode within the cluster)
  • Regular backup of databases used by Ambari, Oozie, Ranger
  • Configurations
    • Ambari Server and Agent configurations (Ambari folders under /etc and /var)
    • Configuration files for each application or service under /etc directory
    • Binaries (/usr/hdp/current)
    • Any OS level configuration changes at each node level made in the cluster
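
For the configuration directories in the list above, a minimal Python sketch of a per-node archive job might look like the following. The function name `archive_dirs`, the `label` parameter, and the destination layout are illustrative assumptions, not part of HDP or Ambari. The sketch follows symlinks because configuration directories are often symlinks on HDP nodes (also an assumption; check your own layout).

```python
import tarfile
import time
from pathlib import Path


def archive_dirs(sources, dest_dir, label="config"):
    """Pack every existing path in `sources` into one timestamped .tar.gz.

    `archive_dirs` and `label` are hypothetical names used for illustration.
    Returns the path of the archive that was written.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = dest / f"{label}-{stamp}.tar.gz"
    # dereference=True stores the files behind symlinks, not the links themselves.
    with tarfile.open(archive, "w:gz", dereference=True) as tar:
        for src in map(Path, sources):
            if src.exists():  # skip paths that are absent on this node
                tar.add(src, arcname=str(src).lstrip("/"))
    return archive
```

It could be called from cron on each node, for example `archive_dirs(["/etc/hadoop/conf"], "/backup/hdp")`, where `/backup/hdp` is a placeholder destination. Copy the resulting archive off the node, since a backup kept on the same disk does not survive a disk failure.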

4 REPLIES


@Roberto Sancho

Yes, you can back up the NN and DN configurations:

/etc/hadoop/conf -- the Hadoop configuration files

/usr/hdp/2.3.4.0-3485 -- reference copy of all the jars, such as the Sqoop jars

Master Collaborator

Thanks.

Is that all? Don't I need anything else from HDP? And what about the CentOS OS?


Is RAID configured for the NN OS disks? If yes, I don't think any additional backup is required.

Just FYI, here is what we are doing:

1) Our master nodes (NN & RM) are RAID configured

2) Taking regular backups of the databases (Ambari, Hue, Oozie, etc.)

3) HDFS Data backup using Falcon/distcp/Snapshot

4) /etc/hadoop/conf

5) /usr/hdp/2.3.4.0-3485/

I hope this helps :)
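
Item 3 above (HDFS data backup with snapshots and distcp) could be scripted roughly as follows. This is only a sketch: the helper names `snapshot_backup_commands` and `run_all`, the snapshot naming scheme, and the example target URI are assumptions for illustration. The commands it builds (`hdfs dfsadmin -allowSnapshot`, `hdfs dfs -createSnapshot`, `hadoop distcp -update`) are the standard Hadoop CLI calls, and `-allowSnapshot` needs HDFS superuser rights.

```python
import posixpath
import shlex
import subprocess
import time


def snapshot_backup_commands(hdfs_dir, target_uri, snapshot_name=None):
    """Build the CLI calls that snapshot `hdfs_dir` and copy the snapshot out.

    Hypothetical helper: it only returns argument lists and runs nothing.
    """
    name = snapshot_name or time.strftime("backup-%Y%m%d")
    src = hdfs_dir.rstrip("/") or "/"
    snap_path = posixpath.join(src, ".snapshot", name)
    return [
        ["hdfs", "dfsadmin", "-allowSnapshot", src],       # mark directory snapshottable
        ["hdfs", "dfs", "-createSnapshot", src, name],     # take a read-only snapshot
        ["hadoop", "distcp", "-update", snap_path, target_uri],  # copy the frozen view
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
```

For example, `run_all(snapshot_backup_commands("/apps/hive/warehouse", "hdfs://backup-nn:8020/backups/warehouse"))` prints the three commands without running them; both the source path and the `backup-nn` host are placeholders. Copying from the `.snapshot` path rather than the live directory means distcp reads a consistent view while jobs keep writing.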
