I have a Kerberized, Sentry-protected CDH cluster with:
1 x Edge
2 x Master
4 x Worker
nodes. I want to set up a secondary cluster for Hive replication purposes.
1. What should be the minimum topology for this task?
2. Should the secondary cluster be Sentry-protected as well?
3. Should the two clusters share the same KDC principals? If so, can the secondary cluster use the KDC server currently installed on the Master1 node?
What are your goals for your failover or backup strategy?
BDR replication only runs on a schedule, and only in one direction.
"Active-Active" concepts may not truly apply to CDH, depending on what you mean by that.