
Hive Metadata replication solution other than Apache Falcon



Hello,

I would like to know whether there is any solution for replicating Hive metadata across clusters other than Apache Falcon's Hive Mirror.

1) How does Hive mirroring in Falcon work internally (at a high level)?

2) Can the same be achieved by backup and restore of metastore DB on a different server?

3) How can the HDFS storage dependencies of tables be managed when the metastore DB is backed up and restored?

Any help on the above questions is much appreciated.

Thanks & Regards,

Megh

5 REPLIES

Re: Hive Metadata replication solution other than Apache Falcon

Hi @Megh Vidani

Data Plane Service (DPS) and Data Lifecycle Manager (DLM) are new products announced by Hortonworks for Disaster Recovery and Backup. Replication will support HDFS data as well as Hive data and metadata.

https://fr.hortonworks.com/products/data-management/dataplane-service/

This product will be available very soon.

For your information, DLM uses Hive's new event-based replication feature, which you can read about here:

https://cwiki.apache.org/confluence/display/Hive/HiveReplicationDevelopment

https://www.slideshare.net/SankarH1/disaster-recovery-and-cloud-migration-for-your-apache-hive-wareh...

I hope this helps


Re: Hive Metadata replication solution other than Apache Falcon


@Abdelkrim Hadjidj Thanks for this info :)

Re: Hive Metadata replication solution other than Apache Falcon


@Abdelkrim Hadjidj Is there any other existing solution or workaround for this?

Re: Hive Metadata replication solution other than Apache Falcon

@Megh Vidani

You can always use Hive's built-in mechanisms with your own replication script. For instance, Hive's EXPORT command writes a table's data and metadata to HDFS. You can use distcp to copy that export from one cluster to another and then IMPORT it there. The new Hive replication features are also worth considering if you can move to a newer Hive version.
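As a rough illustration of the export, distcp, and import flow described above, here is a minimal Python sketch that only builds the three commands and prints them (a dry run). The table name, JDBC URLs, NameNode URIs, and staging directory are hypothetical placeholders, and the helper function names are made up for this example. It assumes `beeline` and `hadoop` are on the PATH of whichever host eventually runs the commands.

```python
import shlex


def export_statement(table, export_dir):
    # HiveQL: write the table's data and metadata under export_dir on HDFS.
    return f"EXPORT TABLE {table} TO '{export_dir}'"


def import_statement(table, import_dir):
    # HiveQL: recreate the table from a previously exported directory.
    return f"IMPORT TABLE {table} FROM '{import_dir}'"


def beeline_command(jdbc_url, statement):
    # Run a single statement through beeline against the given HiveServer2.
    return ["beeline", "-u", jdbc_url, "-e", statement]


def distcp_command(src_uri, dst_uri):
    # Copy the exported directory between clusters.
    return ["hadoop", "distcp", src_uri, dst_uri]


def replication_plan(table, src_jdbc, dst_jdbc, src_nn, dst_nn, staging_dir):
    # Three steps: export on the source, copy, import on the target.
    return [
        beeline_command(src_jdbc, export_statement(table, staging_dir)),
        distcp_command(src_nn + staging_dir, dst_nn + staging_dir),
        beeline_command(dst_jdbc, import_statement(table, staging_dir)),
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders.
    plan = replication_plan(
        table="sales.orders",
        src_jdbc="jdbc:hive2://src-hs2:10000",
        dst_jdbc="jdbc:hive2://dst-hs2:10000",
        src_nn="hdfs://src-nn:8020",
        dst_nn="hdfs://dst-nn:8020",
        staging_dir="/tmp/hive_export/orders",
    )
    for cmd in plan:
        # Dry run: print each command instead of executing it.
        print(shlex.join(cmd))
```

To actually execute the plan, each command list could be passed to `subprocess.run(cmd, check=True)` on a host that can reach the relevant cluster. Treat this as a starting point rather than a tested replication tool.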

Re: Hive Metadata replication solution other than Apache Falcon


@Abdelkrim Hadjidj

Hive export and import work fine with normal tables. With partitioned and bucketed ORC tables, however, we are facing issues because the table directories contain additional delta directories. After the import, Hive does not detect the data present in the table.
