Hi. Can we copy data from a Kerberized Cloudera cluster to a Kerberized Hortonworks cluster using a cross-realm setup? Assume the two clusters run different Hadoop, Hive, and HBase versions and that both have complex network topologies. Please let me know.
Have you seen the following article? It may be what you are looking for: https://community.hortonworks.com/articles/18686/kerberos-cross-realm-trust-for-distcp.html
If this helps, thanks to @jbarnett for giving me the link regarding a similar scenario that he is working on.
I am looking at a Cloudera (CDH) to Hortonworks (HDP) cluster migration. The two distributions are built from different packages and might run different Hadoop/Hive/HBase versions, and Hadoop may not be backward compatible (??). Will distcp work in this case? Please let me know!
If you use distcp with the hftp protocol you should be fine. It is a client-independent implementation, and the only caveat is that the source cluster is read-only. https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/Hftp.html
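To make that concrete, here is a rough sketch of how the distcp call could be put together, with hftp as the source URI and hdfs as the destination. Treat all of it as assumptions rather than values from your clusters: the hostnames are made up, `build_distcp_command` is just a hypothetical helper, and ports 50070 (NameNode HTTP) and 8020 (NameNode RPC) are only common defaults, so check your own configuration. With different Hadoop versions the job is normally launched from the destination cluster, and on Kerberized clusters you would need a valid ticket (e.g. via `kinit`) before running it.

```python
import shlex


def build_distcp_command(src_namenode, dst_namenode, src_path, dst_path,
                         src_http_port=50070, dst_rpc_port=8020):
    """Build a `hadoop distcp` argument list: read over hftp, write over hdfs.

    Hypothetical helper. The port defaults are assumptions (common
    NameNode HTTP and RPC ports), not values taken from any real cluster.
    """
    for path in (src_path, dst_path):
        if not path.startswith("/"):
            raise ValueError(f"path must be absolute: {path!r}")
    # Source uses hftp, which talks to the NameNode's HTTP port.
    src_uri = f"hftp://{src_namenode}:{src_http_port}{src_path}"
    # Destination uses the regular hdfs scheme on the NameNode RPC port.
    dst_uri = f"hdfs://{dst_namenode}:{dst_rpc_port}{dst_path}"
    return ["hadoop", "distcp", src_uri, dst_uri]


# Made-up hostnames, for illustration only.
cmd = build_distcp_command("cdh-nn.example.com", "hdp-nn.example.com",
                           "/data/events", "/data/events")
print(shlex.join(cmd))
# hadoop distcp hftp://cdh-nn.example.com:50070/data/events hdfs://hdp-nn.example.com:8020/data/events
```

The script only prints the command, so you can review it before running it yourself on a node of the destination cluster.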
Thanks. So even if both clusters are Kerberized, with cross-realm trust plus the hftp protocol we should be good, right? Also, can you please explain what you mean by the source cluster being "read only"? Do you mean that no write operations should happen on the source cluster while the hftp distcp is running?