Need Clarification for Hive Data Server Configuration in ODI

I need clarification about the Metastore URI setting of the Hive data server in ODI. According to the official Cloudera documentation, securing a cluster with Sentry requires blocking external applications from accessing the Hive Metastore:

https://www.cloudera.com/documentation/enterprise/latest/topics/sg_sentry_service_config.html#concep...

Block the external applications from accessing the Hive metastore:

  • In the Cloudera Manager Admin Console, select the Hive service.
  • On the Hive service page, click the Configuration tab.
  • In the search well on the right half of the Configuration page, search for Hive Metastore Access Control and Proxy User Groups Override to locate the hadoop.proxyuser.hive.groups parameter and click the plus sign.
  • Enter hive into the text box and click the plus sign again.
  • Enter hue into the text box.
  • Enter sentry into the text box.
  • Click Save Changes.

 

ODI is also an external application, so its need for direct access to the Metastore contradicts this requirement.

Why do we set the Metastore URI in the Hive data server? What is it used for? If it is required, how can this contradiction be explained?

A second problem is that Metastore HA is configured by default in our BDA. How can we configure the ODI Metastore URI to support a Metastore HA setup? Is there a way to write a URI that connects to more than one Metastore? If not, how should we handle this situation?

1 ACCEPTED SOLUTION

Hello, 

 

Oracle Data Integrator connects to Hive by using JDBC and uses Hive and the Hive Query Language (HiveQL), a SQL-like language for implementing MapReduce jobs. Source - HERE

The steps you quoted from the documentation are intended to block external applications and non-service users from accessing the Hive Metastore.

Because ODI connects to Hive using JDBC, it should connect to HiveServer2, as described in this documentation. Queries executed from ODI are sent to HiveServer2. HiveServer2 then contacts the Hive Metastore to fetch the metadata of the tables being queried and proceeds with the execution. ODI does not need to connect to the Hive Metastore directly.
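
To make this concrete, here is a minimal sketch of the kind of JDBC URL that points at HiveServer2 rather than at the Metastore. The helper name, the host hs2.example.com and the database name are hypothetical; port 10000 is the usual HiveServer2 default and may differ on your cluster.

```python
def hiveserver2_jdbc_url(host: str, port: int = 10000, database: str = "default") -> str:
    """Build a plain HiveServer2 JDBC URL: jdbc:hive2://host:port/database.

    Hypothetical helper for illustration only. The host and database are
    placeholders, and 10000 is assumed as the HiveServer2 port.
    """
    if not host:
        raise ValueError("host must not be empty")
    return f"jdbc:hive2://{host}:{port}/{database}"


# Placeholder host; replace with your own HiveServer2 endpoint.
print(hiveserver2_jdbc_url("hs2.example.com"))
```

A URL of this form would go into the JDBC URL field of the Hive data server, so that all metadata lookups pass through HiveServer2.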

 

For details about Hive Metastore HA, please read HERE
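
On the question of a URI that covers more than one Metastore: Hive's own hive.metastore.uris property takes a comma-separated list of Thrift URIs, one per Metastore instance. The sketch below only builds such a value. The host names are hypothetical, 9083 is assumed as the usual Metastore port, and whether the ODI Metastore URI field accepts a list in this form is not confirmed in this thread, so please verify it for your ODI version.

```python
def metastore_uris(hosts, port=9083):
    """Join Metastore hosts into a comma-separated list of Thrift URIs.

    Hypothetical helper: the result has the shape used by the
    hive.metastore.uris property. Port 9083 is an assumed default.
    """
    return ",".join(f"thrift://{host}:{port}" for host in hosts)


# Placeholder host names for two Metastore instances in an HA pair.
print(metastore_uris(["ms1.example.com", "ms2.example.com"]))
```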
