Can I configure Cloudera Impala to use Hortonworks Hive metastore

New Contributor

Hello All,

I have an HDP cluster with Hive services running, and I'm planning to set up a separate CDH cluster with the Impala service. Can I configure Impala to use the Hive metastore in the HDP cluster?

 

3 REPLIES

Re: Can I configure Cloudera Impala to use Hortonworks Hive metastore

Cloudera Employee

Hi @pauljoshiva,

In theory it should be possible. However, CDH and HDP releases are not tested together and ship with different Hive Metastore (HMS) versions; the unified release will be CDP.

I can see two possible approaches. Please note that I have not tried these, and there might be skeletons in the closet:

  1. Using the CDH HMS binaries to connect to the central HMS backend database. The main problem could be the HMS schema, which can differ between releases, especially major ones. For example, HDP 3.x ships with Hive 3 and HDP 2.6.x with Hive 2, while CDH 6.x is packaged with a patched Hive 2, although some Hive 3 fixes may be available in CDH 6 as well. Metastore schema compatibility between releases can be verified with the Metastore Schema tool, which could quickly rule this option out. Also, DBTokenStore should be enabled for both HMS instances.
  2. Pointing Impala at the HDP HMS. There might be API differences between the HMS binaries that could cause unexpected Impala behavior. This can be mitigated by picking versions that are as close as possible. However, because the CDH Hive release is patched with newer fixes, there could still be differences.
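
For the second approach, the setting involved is `hive.metastore.uris` in the `hive-site.xml` that Impala reads. The snippet below is only a minimal sketch that renders such a property. The hostname `hdp-hms.example.com` and the helper name `metastore_uri_property` are made up for illustration, and 9083 is assumed to be the metastore Thrift port (the usual default). It does nothing about the API differences mentioned above.

```python
import xml.etree.ElementTree as ET


def metastore_uri_property(host: str, port: int = 9083) -> str:
    """Render a hive-site.xml <property> element for hive.metastore.uris.

    Hypothetical helper: host and port are assumptions, not values
    taken from either cluster.
    """
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "hive.metastore.uris"
    ET.SubElement(prop, "value").text = f"thrift://{host}:{port}"
    # encoding="unicode" returns a str without an XML declaration
    return ET.tostring(prop, encoding="unicode")


# "hdp-hms.example.com" is a placeholder for the HDP metastore host
snippet = metastore_uri_property("hdp-hms.example.com")
print(snippet)
```

The printed element would go inside the `<configuration>` block of the `hive-site.xml` on the Impala side.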

Additionally, I would recommend backing up any databases that could be affected and contain important metadata.

Re: Can I configure Cloudera Impala to use Hortonworks Hive metastore

New Contributor

@tmater Thanks for your reply.

So I have HDP 3.1.0 with Hive 3.0.0 installed. Which CDH version would be compatible with this?

Re: Can I configure Cloudera Impala to use Hortonworks Hive metastore

Cloudera Employee

The newest CDH release at the moment, 6.3.2, ships a patched Hive 2.1.1. With a major release difference, I believe there will be both HMS schema and HMS API differences.
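
As a rough illustration of the version gap, here is a small sketch that compares two Hive version strings and flags a major-release mismatch. The function names are hypothetical, and the strings are the ones mentioned in this thread. The schema version an HMS database actually carries is recorded in the `SCHEMA_VERSION` column of its `VERSION` table, so that would be the value to feed in. This is not a substitute for the Metastore Schema tool, and because CDH's Hive is patched, matching numbers would not guarantee compatibility either.

```python
def parse_version(version: str) -> tuple:
    """Turn a dotted version such as '2.1.1' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def compare_hive_versions(a: str, b: str) -> str:
    """Classify the gap between two Hive versions (hypothetical helper)."""
    va, vb = parse_version(a), parse_version(b)
    if va[0] != vb[0]:
        return "major mismatch"
    if va[:2] != vb[:2]:
        return "minor mismatch"
    return "same release line"


# Hive 3.0.0 on HDP 3.1.0 versus the patched Hive 2.1.1 in CDH 6.3.2
print(compare_hive_versions("3.0.0", "2.1.1"))  # major mismatch
```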

Depending on the use case, during the POC period the data and metadata could be migrated to the CDH cluster so that you can work on performance there. Later, once Impala is well tested, a workflow could be built in which each cluster handles the tasks best suited to its components.
