Created 09-10-2018 01:42 PM
What options other than distcp are available to copy data, including Hive metadata (Hive tables) and HDFS data, between two HDP clusters?
Created 09-10-2018 02:19 PM
Hello @tauqeer khan.
If you're using HDP, then you can give it a shot with Falcon:
The other thing would be to check the following links:
https://cwiki.apache.org/confluence/display/Hive/Replication
https://medium.com/@anishekagarwal/aapache-hive-introduction-to-replication-v2-2e12edcbeec
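To give a rough idea of what the replication described in those links looks like in practice, here is a minimal sketch in Python that only builds the HiveQL statements as strings. It assumes the Hive 3.x style syntax (`REPL DUMP <db>` on the source, `REPL LOAD <db> FROM '<dir>'` on the target); the syntax differs between Hive versions, so check the wiki page above for yours. The function names, the database name `sales`, and the dump path are hypothetical and only there for illustration.

```python
import re


def _check_identifier(name):
    """Reject names that are not plain identifiers (hypothetical safety helper)."""
    if not re.fullmatch(r"[A-Za-z0-9_]+", name):
        raise ValueError(f"unexpected database name: {name!r}")
    return name


def repl_dump_statement(database, from_event_id=None):
    """Build a REPL DUMP statement to run on the source cluster.

    Without from_event_id this is a bootstrap (full) dump; with it, an
    incremental dump starting at that event id. Assumes Hive 3.x syntax.
    """
    stmt = f"REPL DUMP {_check_identifier(database)}"
    if from_event_id is not None:
        stmt += f" FROM {int(from_event_id)}"
    return stmt


def repl_load_statement(database, dump_dir):
    """Build a REPL LOAD statement to run on the target cluster.

    dump_dir is the directory reported by REPL DUMP. Assumes Hive 3.x syntax.
    """
    if "'" in dump_dir:
        raise ValueError("dump_dir must not contain single quotes")
    return f"REPL LOAD {_check_identifier(database)} FROM '{dump_dir}'"


if __name__ == "__main__":
    # Hypothetical database name and dump path, for illustration only.
    print(repl_dump_statement("sales"))
    print(repl_dump_statement("sales", from_event_id=42))
    print(repl_load_statement("sales", "/tmp/hive_dump/1"))
```

You would then run the generated statements yourself (for example through beeline) on the source and target clusters respectively; the sketch does not connect to Hive.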
And lastly, if you want to replicate Hive and you don't want to use distcp or any of the solutions listed above, you can try the following open-source project from Airbnb (I've used it once, it's pretty cool):
https://github.com/airbnb/reair
Hope this helps!
Created 09-10-2018 02:26 PM
Thanks for the answer, @Vinicius Higa Murakami.
Can we use NiFi?
Created 09-10-2018 02:55 PM
Hmm, @tauqeer khan good question.
I've never tested it myself.
But at first glance, I guess you can take a look at the following answer:
https://community.hortonworks.com/questions/182344/how-to-copy-data-from-a-hive-table-recurrently-us...
And try to build something similar 🙂
BTW, it's a good idea to test it. Let us know if it works or if you run into any issues.
Hope this helps!