
Setup a CDH cluster for BDR purposes



I have a Kerberized, Sentry-protected CDH cluster with the following nodes:

1 x Edge
2 x Master
4 x Worker

I want to set up a secondary cluster for Hive replication purposes.


1. What should be the minimum topology for this task?

2. Should the secondary cluster be Sentry protected as well?

3. Should the two clusters share the same KDC principals? If so, can the secondary cluster use the KDC server currently installed on the Master1 node?


Thank you,





Re: Setup a CDH cluster for BDR purposes

If the second cluster is only for backup purposes, you can skip Sentry. You need at least 3 nodes, and you don't need an edge node. You can use the same KDC for the backup cluster, provided the two clusters are on the same network (i.e. their hostnames and IP addresses do not overlap), but make sure you follow the configuration steps for distcp between Kerberized environments.
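
Before pointing both clusters at one KDC, it may help to confirm the no-overlap condition mentioned above. The following is only a minimal sketch in Python: the function name `find_overlaps` and the host inventories are hypothetical placeholders, not anything taken from CDH or Cloudera Manager, and you would substitute your own hostname-to-IP lists.

```python
from ipaddress import ip_address


def find_overlaps(primary, backup):
    """Return hostnames and IPs that appear in both clusters.

    Each argument is a dict mapping hostname -> IP address string.
    """
    # Hostnames are case-insensitive, so compare them lowercased.
    primary_hosts = {host.lower() for host in primary}
    backup_hosts = {host.lower() for host in backup}

    # Parse the addresses so that malformed entries fail early.
    primary_ips = {ip_address(ip) for ip in primary.values()}
    backup_ips = {ip_address(ip) for ip in backup.values()}

    return {
        "hostnames": sorted(primary_hosts & backup_hosts),
        "ips": sorted(str(ip) for ip in primary_ips & backup_ips),
    }


# Hypothetical inventories; replace with the real hosts of each cluster.
primary_cluster = {
    "master1.prod.example.com": "10.0.1.11",
    "master2.prod.example.com": "10.0.1.12",
    "worker1.prod.example.com": "10.0.1.21",
}
backup_cluster = {
    "bdr1.dr.example.com": "10.0.2.11",
    "bdr2.dr.example.com": "10.0.2.12",
    "bdr3.dr.example.com": "10.0.2.13",
}

overlaps = find_overlaps(primary_cluster, backup_cluster)
print(overlaps)
```

If either list in the result is non-empty, the two clusters share a hostname or an IP address, and that would need to be resolved before they share a KDC.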

Re: Setup a CDH cluster for BDR purposes

Thank you @Tomas79


I am also searching for architecture designs for Active-Active or Active-Passive DR configurations using 2 clusters. This article has some introductory information on this. I was wondering whether more resources are available on this topic.


Best regards,



Re: Setup a CDH cluster for BDR purposes



What are your goals for your failover or backup strategy?


BDR replication schedules run periodically and replicate in one direction only.

"Active-Active" concepts may not truly apply to CDH depending on what you mean by that.