I have an HDP cluster with Hive services running, and I'm planning to set up a separate CDH cluster with the Impala service. Can I configure Impala to use the Hive metastore in the HDP cluster?
In theory it should be possible. However, CDH and HDP releases are not tested together and ship with different Hive Metastore (HMS) versions; the unified release will be CDP.
I can see two possible approaches. Please note that I have not tried these, and there might be skeletons in the closet:
Additionally, I would recommend backing up any databases that could be affected and that contain important metadata.
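As a rough illustration of the backup step, here is a minimal Python sketch that only assembles a dump command. It assumes the metastore is backed by MySQL or MariaDB, which the thread does not state. The host, user, database name, and output path are hypothetical placeholders, and credentials are assumed to come from a MySQL option file. For another backing database, that database's own dump tool would be needed instead.

```python
import shlex


def build_backup_command(host, user, database, out_file):
    """Build a mysqldump command for the metastore database.

    ASSUMPTION: the Hive Metastore is backed by MySQL/MariaDB.
    """
    return [
        "mysqldump",
        "--single-transaction",       # consistent snapshot for InnoDB tables
        f"--host={host}",
        f"--user={user}",
        f"--result-file={out_file}",
        "--databases",
        database,
    ]


if __name__ == "__main__":
    # All values below are hypothetical placeholders, not taken from the thread.
    cmd = build_backup_command(
        host="hms-db.example.com",
        user="hive",
        database="hive",
        out_file="/tmp/hms_backup.sql",
    )
    # Print the command for review instead of running it.
    print(shlex.join(cmd))
```

Printing the command rather than executing it leaves room to review it before touching a production metastore database.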
The newest CDH release at the moment, 6.3.2, ships a patched Hive 2.1.1. With a major-version difference between the two clusters' Hive versions, I believe there will be differences in both the HMS schema and the HMS API.
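To make the version concern concrete, here is a small Python sketch that compares the major versions of two Hive releases. The CDH value comes from the text above. The HDP value is a hypothetical placeholder, since the thread does not say which HDP release is in use. The helper names are illustrative only.

```python
def major_version(version: str) -> int:
    """Return the leading major number of a version string such as '2.1.1'."""
    return int(version.split(".")[0])


def same_major(a: str, b: str) -> bool:
    """True when both version strings share the same major version."""
    return major_version(a) == major_version(b)


if __name__ == "__main__":
    cdh_hive = "2.1.1"  # from the post: CDH 6.3.2 ships a patched Hive 2.1.1
    hdp_hive = "3.1.0"  # ASSUMPTION: hypothetical value; check your HDP release
    if same_major(cdh_hive, hdp_hive):
        print("Same major version: schema/API drift is less likely, but still untested.")
    else:
        print("Different major versions: expect HMS schema and API differences.")
```

A matching major version does not guarantee compatibility, because vendor builds are patched. The check only flags the case where differences are most likely.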
Depending on the use case, the data and metadata could be migrated to the CDH cluster for the POC period so that you can work on performance there. Later, once Impala has proven itself, a workflow could be built in which each cluster handles the tasks best suited to its components.