
How can we keep the data in 2 data centers in sync ?


Explorer

Hi Team,

 

We have a 12-node cluster hosted on premises in 2 different regions.

The question is how we can keep the data in the 2 data centers in sync, and what the latency will be.

 

Appreciate your help.

 

Thanks & Regards

 

8 REPLIES

Re: How can we keep the data in 2 data centers in sync ?

Expert Contributor

Hi @ARVINDR 

 

I'd like to clarify your scenario. Do you have:

 

A) 12 nodes in region A and 12 nodes in region B i.e. 2 distinct Cloudera clusters with a total of 24 nodes and you want to replicate data between these clusters?

 

or

 

B) 12 nodes split across regions A and B i.e. a single cluster of 12 nodes?

 

Also what version of Cloudera are you using?

 

Regards,

Steve


Re: How can we keep the data in 2 data centers in sync ?

Explorer

Hi Steven,

Please consider scenario A:

 

A) 12 nodes in region A and 12 nodes in region B i.e. 2 distinct Cloudera clusters with a total of 24 nodes and we want to replicate data between these clusters

 

We are using HDP version 3.0.1.

 

Thanks & Regards

 


Re: How can we keep the data in 2 data centers in sync ?

Expert Contributor

Hi @ARVINDR 

 

In this case, you would use Data Lifecycle Manager to replicate data between the two clusters.

 

Here is a link to the documentation:

 

https://docs.cloudera.com/HDPDocuments/DLM1/DLM-1.5.1/administration/content/dlm_hdfs_replication_ov...
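
For a sense of what cluster-to-cluster HDFS copying looks like at the command level, here is a minimal sketch that assumes plain `hadoop distcp` rather than DLM itself (DLM is configured through its own policies). The NameNode hostnames, the path, and the helper name `build_distcp_command` are hypothetical; the function only assembles the command line and does not run it.

```python
import shlex
from typing import List, Optional


def build_distcp_command(
    source: str,
    target: str,
    max_maps: Optional[int] = None,
    bandwidth_mb: Optional[int] = None,
    delete_missing: bool = False,
) -> List[str]:
    """Assemble a `hadoop distcp` command for an incremental copy.

    -update copies only files that are missing or differ on the target.
    -delete (valid together with -update) removes target files that no
    longer exist on the source.
    -m caps the number of map tasks; -bandwidth caps MB/s per map.
    """
    cmd = ["hadoop", "distcp", "-update"]
    if delete_missing:
        cmd.append("-delete")
    if max_maps is not None:
        cmd += ["-m", str(max_maps)]
    if bandwidth_mb is not None:
        cmd += ["-bandwidth", str(bandwidth_mb)]
    cmd += [source, target]
    return cmd


if __name__ == "__main__":
    # Hypothetical NameNode addresses and path, for illustration only.
    cmd = build_distcp_command(
        "hdfs://nn-region-a:8020/data/events",
        "hdfs://nn-region-b:8020/data/events",
        max_maps=20,
        bandwidth_mb=50,
    )
    print(shlex.join(cmd))
```

Capping the map count and per-map bandwidth is one way to keep a cross-region copy from saturating the link between the two data centers.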

 

The latency will be a function of your network. I can share some general networking guidelines here:

 

https://docs.cloudera.com/documentation/other/reference-architecture/topics/ra_bare_metal_deployment...
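
To make "a function of your network" a little more concrete, here is a rough back-of-the-envelope sketch. It assumes the lag is dominated by pushing the changed data over the inter-region link; the `efficiency` factor (protocol overhead, contention) and all figures are illustrative assumptions, not measurements from this thread.

```python
def estimate_transfer_seconds(
    data_gb: float, link_gbps: float, efficiency: float = 0.7
) -> float:
    """Rough time to move `data_gb` gigabytes over a `link_gbps` link.

    `efficiency` is an assumed fraction of the nominal link speed that
    is actually usable for replication traffic.
    """
    if data_gb < 0:
        raise ValueError("data_gb must be non-negative")
    if link_gbps <= 0:
        raise ValueError("link_gbps must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    data_gigabits = data_gb * 8  # gigabytes to gigabits
    return data_gigabits / (link_gbps * efficiency)


if __name__ == "__main__":
    # Illustrative only: 500 GB of changed data over a 10 Gbps link.
    seconds = estimate_transfer_seconds(500, 10)
    print(f"about {seconds / 60:.1f} minutes")
```

This is only a lower bound on how far behind the second site can be: the interval at which replication jobs are scheduled adds to it.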

 

Regards,

Steve


Re: How can we keep the data in 2 data centers in sync ?

Explorer

Hi @StevenOD,

 

Thanks for the details.

Just one query: we are building Hadoop on top of Isilon. Will this still hold true in that case?

 

Thanks & Regards

Arvind.

 


Re: How can we keep the data in 2 data centers in sync ?

Expert Contributor

Hi @ARVINDR ,

 

I am not an expert in Isilon, and I'm not sure that Data Lifecycle Manager (DLM) supports using Isilon as the storage layer.

Isilon uses the OneFS file system. OneFS provides its own utilities for backup and replication, so it might be better to use tools that are native to Isilon in this scenario.

 

Regards,

Steve


Re: How can we keep the data in 2 data centers in sync ?

Explorer

Thanks @steve

Could you please suggest native tools for this scenario?

 

Thanks & Regards

 


Re: How can we keep the data in 2 data centers in sync ?

Expert Contributor

Hi @ARVINDR ,

 

I suggest you reach out to Dell EMC for guidance around Isilon.

I found some documentation that suggests some techniques:

https://www.dellemc.com/ro-ro/collaterals/unauth/white-papers/products/storage/h10588-isilon-data-av...

I'll caveat that I'm not an expert in Isilon, so my responses here are best endeavors.

 

Regards,

Steve


Re: How can we keep the data in 2 data centers in sync ?

Explorer

Thanks for sharing the documents, @StevenOD.
