Created 09-29-2016 11:27 PM
Can someone help me with the best practices and approaches for data replication in a production cluster?
Created 09-30-2016 03:36 PM
Hi Suri,
@Suri Nuthalapati
Use Falcon to mirror HDFS data.
Falcon can also be used to mirror Hive metadata.
Hope that helps!
Created 10-01-2016 04:20 PM
Hi Patel,
Thanks for your answer. Do you know of any best practices for using DistCp? I am looking for best practices other than Falcon.
Thanks,
Suri
Created 10-01-2016 08:35 PM
Hi Suri,
With DistCp you will not be able to replicate Hive metadata. Only HDFS data can be replicated.
If you have any further questions, feel free to ask.
If you like my answer, please select it as the best answer!
Thanks.
Created 10-01-2016 08:36 PM
Created 10-01-2016 09:10 PM
Hive Metastore replication
https://cwiki.apache.org/confluence/display/Hive/Replication
https://cwiki.apache.org/confluence/display/Hive/HiveReplicationDevelopment
Once all the metastores are in HBase
Created 10-02-2016 03:33 PM
Timothy, thank you for your response, but I am also looking for the best ways to replicate HDFS data using DistCp.
Suri
Created 01-05-2017 07:52 PM
Distributed Copy (DistCp): http://hadoop.apache.org/docs/r2.7.3/hadoop-distcp/DistCp.html
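To make the DistCp route a bit more concrete, here is a minimal sketch of how an incremental HDFS-to-HDFS copy could be assembled and launched from Python. It only uses options described in the DistCp guide linked above (-update, -delete, -p, -m, -bandwidth). The helper names build_distcp_command and run_distcp, the NameNode addresses prod-nn and dr-nn, and the /data/warehouse path are all hypothetical placeholders, and the default values are illustrative, not tuned recommendations. As noted earlier in the thread, this copies HDFS data only, not Hive metadata.

```python
import subprocess
from typing import List, Optional


def build_distcp_command(
    source: str,
    target: str,
    max_maps: int = 20,
    bandwidth_mb: Optional[int] = None,
    delete_missing: bool = False,
) -> List[str]:
    """Assemble an incremental `hadoop distcp` command line (hypothetical helper)."""
    # -update copies only files that are missing or differ on the target.
    # -pugp preserves user, group and permission on the copied files.
    cmd = ["hadoop", "distcp", "-update", "-pugp"]
    if delete_missing:
        # -delete removes target files that no longer exist in the source;
        # DistCp only accepts it together with -update or -overwrite.
        cmd.append("-delete")
    # -m caps the number of simultaneous copy maps.
    cmd += ["-m", str(max_maps)]
    if bandwidth_mb is not None:
        # -bandwidth limits each map to this many MB/second.
        cmd += ["-bandwidth", str(bandwidth_mb)]
    cmd += [source, target]
    return cmd


def run_distcp(cmd: List[str]) -> None:
    """Run the command; raises CalledProcessError on a non-zero exit code."""
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Placeholder cluster addresses and path; replace with real values.
    command = build_distcp_command(
        "hdfs://prod-nn:8020/data/warehouse",
        "hdfs://dr-nn:8020/data/warehouse",
        max_maps=10,
        bandwidth_mb=50,
        delete_missing=True,
    )
    # Print instead of executing so the sketch can be reviewed safely first.
    print(" ".join(command))
```

Once the printed command looks right, it can be passed to run_distcp from a scheduler. Capping -m and -bandwidth is one way to keep the copy from competing with production jobs for cluster and network capacity, and -delete is left off by default because it removes data on the target.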