Created on 09-20-2018 03:16 AM - edited 09-16-2022 06:43 AM
I have configured Apache Drill 1.11 and Hive (CDH 5.15 parcel), both of which are kerberized. The storage plugins are created and work fine for listing databases, tables, etc. However, when we run a statement such as select * from hive.`sales`, it fails with the error below: Can't get Master Kerberos principal for use as renewer
I have tried all options for the past 3 days before posting this in the forum. Any guidance will be very helpful. Thanks.
[245c9770-bfd6-0f9b-c535-9eca7a4ac03e:foreman] ERROR o.a.drill.exec.work.foreman.Foreman - SYSTEM ERROR: IOException: Can't get Master Kerberos principal for use as renewer
[Error Id: f1678f86-7603-40b7-8eda-4d56e128a760 on myzul02c.ad.infosys.com:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: IOException: Can't get Master Kerberos principal for use as renewer
.
.
.
Caused by: java.io.IOException: Failed to get numRows from HiveTable
at org.apache.drill.exec.store.hive.HiveMetadataProvider.getStats(HiveMetadataProvider.java:112) ~[drill-storage-hive-core-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.hive.HiveScan.getScanStats(HiveScan.java:229) ~[drill-storage-hive-core-1.11.0.jar:1.11.0]
... 32 common frames omitted
Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: Failed to create input splits: Can't get Master Kerberos principal for use as renewer
at org.apache.drill.exec.store.hive.HiveMetadataProvider.splitInputWithUGI(HiveMetadataProvider.java:263) ~[drill-storage-hive-core-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.hive.HiveMetadataProvider.getTableInputSplits(HiveMetadataProvider.java:127) ~[drill-storage-hive-core-1.11.0.jar:1.11.0]
at org.apache.drill.exec.store.hive.HiveMetadataProvider.getStats(HiveMetadataProvider.java:95) ~[drill-storage-hive-core-1.11.0.jar:1.11.0]
... 33 common frames omitted
Caused by: java.io.IOException: Can't get Master Kerberos principal for use as renewer
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:116) ~[hadoop-mapreduce-client-core-2.7.1.jar:na]
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodesInternal(TokenCache.java:100) ~[hadoop-mapreduce-client-core-2.7.1.jar:na]
at org.apache.hadoop.mapreduce.security.TokenCache.obtainTokensForNamenodes(TokenCache.java:80) ~[hadoop-mapreduce-client-core-2.7.1.jar:na]
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:206) ~[hadoop-mapreduce-client-core-2.7.1.jar:na]
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315) ~[hadoop-mapreduce-client-core-2.7.1.jar:na]
at org.apache.drill.exec.store.hive.HiveMetadataProvider$1.run(HiveMetadataProvider.java:252) ~[drill-storage-hive-core-1.11.0.jar:1.11.0]
Created 09-25-2018 02:06 AM
Refer to this link: https://hadoop.apache.org/docs/r2.7.1/hadoop-project-dist/hadoop-common/Superusers.html
If the cluster is running in Secure Mode, the superuser must have kerberos credentials to be able to impersonate another user.
It cannot use delegation tokens for this feature. It would be wrong if superuser adds its own delegation token to the proxy user ugi, as it will allow the proxy user to connect to the service with the privileges of the superuser.
However, if the superuser does want to give a delegation token to joe, it must first impersonate joe and get a delegation token for joe, in the same way as the code example above, and add it to the ugi of joe. In this way the delegation token will have the owner as joe.
This was the major clue. Hive impersonates user ‘drill’ to run queries. A quick sequence of events is listed below:
I went on researching how this works in CDH.
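For context on where the message in the stack trace comes from, here is a minimal sketch of the renewer lookup. It assumes the check in TokenCache.obtainTokensForNamenodesInternal (hadoop-mapreduce-client-core 2.7.1) behaves roughly like this: it reads the master principal from the client configuration and throws when nothing is set. The class name RenewerLookupSketch and the plain Map standing in for the Hadoop Configuration are hypothetical, and the _HOST substitution that Hadoop performs is left out. Whether a missing principal in Drill's client-side configuration is what happened on this cluster is an assumption, not something confirmed above.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/**
 * Stand-alone model of the renewer lookup that fails in the stack trace.
 * It does not use any Hadoop classes; a Map stands in for the Configuration.
 */
final class RenewerLookupSketch {

    // Property names used by Hadoop 2.x for the framework and master principals.
    static final String FRAMEWORK_NAME = "mapreduce.framework.name";
    static final String RM_PRINCIPAL = "yarn.resourcemanager.principal";
    static final String JT_PRINCIPAL = "mapreduce.jobtracker.kerberos.principal";

    /** Returns the principal that would be named as delegation token renewer. */
    static String resolveRenewer(Map<String, String> conf) throws IOException {
        // The framework defaults to "yarn" when the property is absent.
        String framework = conf.getOrDefault(FRAMEWORK_NAME, "yarn");
        String principal = "classic".equals(framework)
                ? conf.get(JT_PRINCIPAL)
                : conf.get(RM_PRINCIPAL);
        if (principal == null || principal.isEmpty()) {
            // Same message as in the trace: no principal visible to the client.
            throw new IOException("Can't get Master Kerberos principal for use as renewer");
        }
        return principal;
    }

    public static void main(String[] args) throws IOException {
        Map<String, String> conf = new HashMap<>();
        try {
            resolveRenewer(conf);
        } catch (IOException e) {
            System.out.println("without principal: " + e.getMessage());
        }
        // Placeholder value, not taken from this cluster.
        conf.put(RM_PRINCIPAL, "yarn/_HOST@EXAMPLE.COM");
        System.out.println("with principal: " + resolveRenewer(conf));
    }
}
```

If this model matches your setup, the thing to verify is whether the configuration files on the Drill classpath carry the same principal property as the cluster's own client configuration.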
I hope the solution is useful. Marking it as resolved.
Created 05-24-2019 03:10 AM
Dear all,
I have a table in Hive, but queries on it take a long time because of the huge amount of data in that table.
Please guide me on which tools would help me get data back in under a second, or within milliseconds.
Tools like Drill or Presto?
Please help.
Thanks
HadoopHelp