
Drill - Hive connectivity : Can't get Master Kerberos principal for use as renewer


I have configured Apache Drill 1.11 and Hive (CDH 5.15 parcel), both of which are Kerberized. The storage plugins are created and work fine for listing databases, tables, and so on. However, running a statement such as select * from hive.`sales` fails with the following error: Can't get Master Kerberos principal for use as renewer

 

I tried every option I could find over the past 3 days before posting in this forum. Any guidance would be very helpful. Thanks.

 

[245c9770-bfd6-0f9b-c535-9eca7a4ac03e:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: IOException: Can't get Master Kerberos principal for use as renewer

[Error Id: f1678f86-7603-40b7-8eda-4d56e128a760 on myzul02c.ad.infosys.com:31010]

org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IOException: Can't get Master Kerberos principal for use as renewer

.

.

.

 

Caused by: java.io.IOException: Failed to get numRows from HiveTable

        at org.apache.drill.exec.store.hive.HiveMetadataProvider.getStats(HiveMetadataProvider.java:112) ~[drill-storage-hive-core-1.11.0.jar:1.11.0]

        at org.apache.drill.exec.store.hive.HiveScan.getScanStats(HiveScan.java:229) ~[drill-storage-hive-core-1.11.0.jar:1.11.0]

        ... 32 common frames omitted

Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Failed to create input splits: Can't get Master Kerberos principal for use as renewer

        at org.apache.drill.exec.store.hive.HiveMetadataProvider.splitInputWithUGI(HiveMetadataProvider.java:263) ~[drill-storage-hive-core-1.11.0.jar:1.11.0]

        at org.apache.drill.exec.store.hive.HiveMetadataProvider.getTableInputSplits(HiveMetadataProvider.java:127) ~[drill-storage-hive-core-1.11.0.jar:1.11.0]

        at org.apache.drill.exec.store.hive.HiveMetadataProvider.getStats(HiveMetadataProvider.java:95) ~[drill-storage-hive-core-1.11.0.jar:1.11.0]

        ... 33 common frames omitted

Caused by: java.io.IOException: Can't get Master Kerberos principal for use as renewer

        at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:116) ~[hadoop-mapreduce-client-core-2.7.1.jar:na]

        at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100) ~[hadoop-mapreduce-client-core-2.7.1.jar:na]

        at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80) ~[hadoop-mapreduce-client-core-2.7.1.jar:na]

        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:206) ~[hadoop-mapreduce-client-core-2.7.1.jar:na]

        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315) ~[hadoop-mapreduce-client-core-2.7.1.jar:na]

        at org.apache.drill.exec.store.hive.HiveMetadataProvider$1.run(HiveMetadataProvider.java:252) ~[drill-storage-hive-core-1.11.0.jar:1.11.0]

1 ACCEPTED SOLUTION


Refer to this link: https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Superusers.html

 

If the cluster is running in Secure Mode, the superuser must have kerberos credentials to be able to impersonate another user.

It cannot use delegation tokens for this feature. It would be wrong if superuser adds its own delegation token to the proxy user ugi, as it will allow the proxy user to connect to the service with the privileges of the superuser.

However, if the superuser does want to give a delegation token to joe, it must first impersonate joe and get a delegation token for joe, in the same way as the code example above, and add it to the ugi of joe. In this way the delegation token will have the owner as joe.

 

This was the major clue. Hive impersonates the user ‘drill’ to run queries. The sequence of events was:

  1. Impersonation was enabled in Drill (in drill-override.conf), following this article: https://drill.apache.org/docs/configuring-user-impersonation-with-hive-authorization/
  2. A Kerberos token for the user drill was generated.
  3. The drillbit started successfully.
  4. Hive tried to impersonate “drill” and got the error “Can't get Master Kerberos principal for use as renewer”.
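
The stack trace in the question shows the error being raised from Hadoop's TokenCache, which needs a renewer principal before it requests a delegation token. The Python sketch below is only a simplified model of that check, not Hadoop's actual code. It assumes the renewer is read from the yarn.resourcemanager.principal property, and the function name master_principal_for_renewer is hypothetical.

```python
def master_principal_for_renewer(conf):
    """Return the renewer principal from a dict of Hadoop properties.

    Simplified model (assumption): the renewer is taken from
    yarn.resourcemanager.principal. If the client has no yarn-site.xml
    on its classpath, that property is absent and the lookup fails.
    """
    principal = conf.get("yarn.resourcemanager.principal")
    if not principal:
        # Same message as the error reported in this thread.
        raise IOError("Can't get Master Kerberos principal for use as renewer")
    return principal
```

With an empty configuration the function raises the same message seen in the Drill log. With the property present it returns the principal unchanged.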

 

I then researched how this works in CDH.

  • CDH uses the NodeManager as the renewer, and this is provided by the YARN gateway. So in the case of Hive (through CDH), it works fine as long as the YARN gateway role instance is enabled.
  • Drill is external to CDH, and to access Hive it can't impersonate without the superuser's credentials. The superuser here is YARN: Drill tries to use YARN's details for impersonation, and this information comes from yarn-site.xml. After I placed core-site.xml and yarn-site.xml in Drill's conf directory, it started working, and I can now query Hive tables successfully.
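
To verify the fix described above, a small check like the following may help. It is a minimal sketch, assuming the two site files use the standard Hadoop `<configuration><property>` layout and that the relevant renewer setting is yarn.resourcemanager.principal. The helper names load_site_properties and check_drill_conf are hypothetical and not part of Drill or Hadoop.

```python
import os
import xml.etree.ElementTree as ET

REQUIRED_FILES = ("core-site.xml", "yarn-site.xml")
RENEWER_KEY = "yarn.resourcemanager.principal"  # assumed property name


def load_site_properties(path):
    """Parse a Hadoop *-site.xml file into a {name: value} dict."""
    props = {}
    for prop in ET.parse(path).getroot().iter("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def check_drill_conf(conf_dir):
    """Return a list of problems found in conf_dir (empty means OK)."""
    problems = []
    merged = {}
    for filename in REQUIRED_FILES:
        path = os.path.join(conf_dir, filename)
        if not os.path.isfile(path):
            problems.append("missing " + filename)
            continue
        merged.update(load_site_properties(path))
    if not merged.get(RENEWER_KEY):
        problems.append(RENEWER_KEY + " is not set")
    return problems
```

Calling check_drill_conf on Drill's conf directory (the exact path depends on the installation) returns an empty list when both files are present and the principal is set. The drillbits most likely need a restart after the files are copied so the new configuration is picked up.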

 

I hope the solution is useful. Marking it as resolved.

 


2 REPLIES


 


Dear all,

I have a table in Hive, but queries on it take a long time because of the huge amount of data in the table.

Please guide me: which tools would help me get data back in under a second, or in milliseconds?

Tools like Drill or Presto?

Please help.

Thanks,

HadoopHelp