Can we schedule automatic incremental backup in hive without reconcile?

Explorer

Hi,

I would like to know whether Hive has an option for incremental backup, similar to MySQL replication, so that we do not have to repeat the manual steps each time: creating a base table and an incremental table, then reconciling them.

2 REPLIES
Re: Can we schedule automatic incremental backup in hive without reconcile?

Expert Contributor

Please elaborate more on your use case.

MySQL replication allows you to copy data from one server (master) to one or more servers (slaves).

In this context, do you have multiple Hive metastores running on different clusters? Do you want to replicate just the underlying Hive metadata (the schema) or the data itself?

If you are using MySQL as your metastore DB, you can use the same MySQL replication to copy this information.

If you want to replicate the data within a cluster, HDFS already does that by storing multiple copies of each block.

If you want to replicate data to another cluster, see DistCp or NiFi.
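
For the cross-cluster case, here is a minimal sketch of how an incremental copy could be scripted around DistCp. It assumes the `hadoop` CLI is on the PATH of the machine that runs it; the NameNode addresses, the warehouse path, and the helper names `build_distcp_command` and `run_distcp` are hypothetical and only for illustration. The `-update` flag asks DistCp to copy only files that are missing or different on the target, which is what makes repeated runs incremental.

```python
import shlex
import subprocess


def build_distcp_command(src, dst, delete_missing=False):
    """Build a `hadoop distcp` command that copies only new or changed files."""
    cmd = ["hadoop", "distcp", "-update"]
    if delete_missing:
        # -delete removes files from the target that no longer exist in the
        # source; DistCp only accepts it together with -update or -overwrite.
        cmd.append("-delete")
    cmd += [src, dst]
    return cmd


def run_distcp(src, dst, delete_missing=False):
    """Run the copy; raises CalledProcessError if distcp exits non-zero."""
    subprocess.run(build_distcp_command(src, dst, delete_missing), check=True)


if __name__ == "__main__":
    # Hypothetical cluster addresses and Hive warehouse directory.
    cmd = build_distcp_command(
        "hdfs://prod-nn:8020/apps/hive/warehouse/sales.db",
        "hdfs://backup-nn:8020/apps/hive/warehouse/sales.db",
    )
    print(shlex.join(cmd))
```

A script like this could be run from cron or a workflow scheduler to get the "automatic" part. Note that it copies only the files in HDFS; the table definitions in the metastore would still need to be replicated separately, for example with the MySQL replication mentioned above.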

Re: Can we schedule automatic incremental backup in hive without reconcile?

Expert Contributor

I think you mean incremental updates on a Hive table?

As of now, I think that is the way to go. As you may be aware, data in HDFS cannot be modified in place once it has been written. The incremental update procedure for a Hive table therefore mimics the record-level updates that a DBMS like MySQL can perform directly. Recent versions of Hive added transaction support, which allows records in a Hive table to be updated. However, I have not personally tested incremental updates using this capability.
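
To give an idea of what the transactional route might look like, here is a rough, untested sketch that generates a HiveQL `MERGE` statement to apply an incremental table to a base table in one step. It assumes a Hive version that supports `MERGE` (2.2 or later, as far as I know) and a target table created as a transactional (ACID) table. The table and column names and the helper `build_merge_sql` are hypothetical.

```python
def build_merge_sql(target, staging, key, columns):
    """Return a HiveQL MERGE statement that upserts `staging` into `target`.

    `columns` lists every column of the target table in table order,
    including `key`.
    """
    # Matched rows get every non-key column overwritten from the staging row.
    updates = ", ".join(f"{c} = s.{c}" for c in columns if c != key)
    # Unmatched rows are inserted whole, in the target table's column order.
    values = ", ".join(f"s.{c}" for c in columns)
    return (
        f"MERGE INTO {target} AS t\n"
        f"USING {staging} AS s\n"
        f"ON t.{key} = s.{key}\n"
        f"WHEN MATCHED THEN UPDATE SET {updates}\n"
        f"WHEN NOT MATCHED THEN INSERT VALUES ({values})"
    )


if __name__ == "__main__":
    # Hypothetical base table, incremental table, and columns.
    print(build_merge_sql("customers", "customers_incr", "id",
                          ["id", "name", "modified"]))
```

The generated statement could then be submitted on a schedule through whatever Hive client you already use. Whether this fully replaces the base/incremental/reconcile procedure depends on your Hive version and table layout, so treat it as a starting point rather than a tested recipe.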