
Spark Query to join hive Tables from different clusters

Assuming we have two Cloudera Hadoop clusters (both with Kerberos authentication) and would like to join Hive tables from each cluster: is it possible to submit a Spark job on one of the clusters to perform the join? If not, what other options are there? We are trying to avoid copying the data over to one cluster just for the join.
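To make the intent concrete, here is a rough sketch of what we have in mind, assuming the second cluster's HDFS NameNode is network-reachable from the first. The hostnames, port, and script name are placeholders, and this relies on Spark's support for obtaining delegation tokens for additional Kerberized filesystems (`spark.yarn.access.hadoopFileSystems` on older Spark-on-YARN versions, renamed `spark.kerberos.access.hadoopFileSystems` in Spark 3):

```shell
# Hypothetical sketch: submit the join job on cluster A, while also
# requesting Kerberos delegation tokens for cluster B's HDFS so the
# job can read cluster B's table files directly.
spark-submit \
  --master yarn \
  --deploy-mode cluster \
  --conf spark.yarn.access.hadoopFileSystems=hdfs://namenode-cluster-b:8020 \
  cross_cluster_join.py
```

Inside the job, the local table would be read through cluster A's metastore, while cluster B's table would presumably have to be read from its HDFS path directly (or via a second metastore connection), since a single SparkSession normally talks to one Hive metastore. Is that the recommended approach?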

Are there any guides or documents we can reference to configure such a use case? Are there any dependencies on software versions?

Thank you. We appreciate any advice.