How many NameNodes are needed on a backup cluster?
- Labels: Apache Falcon, Apache Hadoop
Created 02-11-2016 03:37 AM
The customer wants to set up backups to a backup cluster and wants to know how many NameNodes the backup cluster requires. Is there a rule of thumb? The constraints for the backup cluster are below.
- Constraints
- Backup window: 1am-5am ET (12am-4am CT)
- They want a YARN job that runs the Falcon backup jobs in a separate queue when the cluster isn't busy, so that backups don't pile up overnight.
- The clusters will run in Amazon, so they want the backup cluster to be backed by Amazon S3 storage (used as HDFS).
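As a rough illustration of the backup window constraint, here is a minimal Python sketch that checks whether a given moment falls inside the 1am-5am ET window, whichever time zone it is expressed in. The function name `in_backup_window` and the choice of `America/New_York` and `America/Chicago` as the ET and CT zones are assumptions made for this example. They are not part of Falcon or YARN, and this is not how Falcon schedules jobs.

```python
from datetime import datetime, time
from zoneinfo import ZoneInfo  # standard library, Python 3.9+

# Assumed zone names for "ET" and "CT" in the constraints above.
EASTERN = ZoneInfo("America/New_York")
CENTRAL = ZoneInfo("America/Chicago")

# Backup window from the constraints: 1am-5am ET (start inclusive, end exclusive).
WINDOW_START = time(1, 0)
WINDOW_END = time(5, 0)


def in_backup_window(moment: datetime) -> bool:
    """Return True if a timezone-aware datetime falls in the 1am-5am ET window."""
    eastern_time = moment.astimezone(EASTERN).time()
    return WINDOW_START <= eastern_time < WINDOW_END


# 12:30am Central is 1:30am Eastern, so it is inside the window.
sample = datetime(2016, 2, 11, 0, 30, tzinfo=CENTRAL)
print(in_backup_window(sample))  # True
```

A check like this could gate a hypothetical wrapper script that submits backup work, but the actual scheduling would normally be left to Falcon and the YARN queue configuration.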
Created 02-11-2016 06:12 AM
Hi @mhendricks, you should use two NameNodes to increase the availability of the backup cluster. This is especially important if you only have a limited window for your backup (like 1-5am).
This might also be helpful:
https://community.hortonworks.com/questions/4135/what-to-backup-and-how-only-metadata-not-data.html
Created 02-11-2016 03:40 AM
Two NameNodes, in an active/passive setup.
I would treat the backup cluster's setup, though not its config, almost the same as production's.
By config I mean CPU and memory.
Created 02-11-2016 03:47 AM
The overall flow looks good.
Created 02-11-2016 08:28 AM
+1. Cost savings are great, but not with backups.
