We have a cluster with Sentry enabled. I know that Hive (HiveServer1) does not use Sentry, so it lists all databases. But when I connect to HiveServer2, which uses the Sentry authorization policy files (I believe Impala and Hive share the same Sentry policy files), show databases lists only a few databases, say A, B, C, D, whereas Impala lists A, B, C, D, E, F, which is the correct list per the authorization rules set. I am not sure why HiveServer2 does not list the correct ones. To my surprise, USE <E> runs fine and I can read the tables under that schema. I also tested whether HiveServer2 is bypassing Sentry, and it is not: if I run USE <H>, where H is a database I am not authorized for per Sentry, HiveServer2 rejects it with: Error: Error while compiling statement: FAILED: SemanticException No valid privileges (state=42000,code=40000). I think this is a bug. Moreover, it is a problem when we integrate Hive with visualization tools such as Tableau, because they do not list the expected databases.
I am open to discussion and your suggestions. I would also like to report this bug. Please let me know whether I should raise it with Cloudera, since we are using CDH, or go through the Apache Hive bug tracking system.
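To pin down exactly which databases differ between the two engines, the two listings can be compared as sets. This is only a minimal sketch: the lists are typed in by hand from the example in this post, and the function name `missing_in_hive` is made up for illustration. In practice the lists would come from running show databases against each engine.

```python
# Database names as given in the example above (entered by hand, not queried).
hive_dbs = ["A", "B", "C", "D"]
impala_dbs = ["A", "B", "C", "D", "E", "F"]


def missing_in_hive(hive, impala):
    """Return the databases Impala lists that HiveServer2 does not, sorted."""
    return sorted(set(impala) - set(hive))


print(missing_in_hive(hive_dbs, impala_dbs))  # ['E', 'F']
```

Each database in the result is then worth checking with USE, to confirm it is authorized but simply not listed.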
Only databases on which a user has some permissions (or permissions on their tables) are visible when the user runs show databases. The "default" database is the only exception: it appears in the list even if the user has no explicit permissions on any object in it, for compatibility with some BI tools. This exception always applies in Hive, and applies in Impala since CDH 5.2.
Do you see any behavior difference apart from this exception?
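The visibility rule described in this reply can be modeled roughly as follows. This is a sketch of the stated rule only, not Sentry's actual implementation. The function name `visible_databases`, the `(database, table)` grant format, and the sample names are all assumptions made for illustration.

```python
def visible_databases(all_databases, grants):
    """Model of the rule: a database is listed if the user holds any grant
    on it or on one of its tables; "default" is always listed.

    grants: iterable of (database, table_or_None) pairs (hypothetical format).
    """
    granted = {db for db, _table in grants}
    return [db for db in all_databases if db == "default" or db in granted]


# Hypothetical setup: a database-level grant on A, a table-level grant in E,
# and no grants at all on H or default.
all_dbs = ["default", "A", "E", "H"]
grants = [("A", None), ("E", "some_table")]

print(visible_databases(all_dbs, grants))  # ['default', 'A', 'E']
```

Under this rule a database with only table-level grants, like E here, should still be listed. If HiveServer2 leaves such a database out while USE on it succeeds, that is the kind of difference the question above is asking about.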
Yes, I see a difference. As I mentioned, I am not able to see the authorized tables either, whereas I can see them through Impala. Moreover, if I issue USE <tablename authorized but not visible>, it is allowed. I think I explained this clearly with the example.
Sree, thanks for the detailed explanation of the problem. Which CDH version are you on?
I don't know why so many people use policy files. Why not use the Sentry database?
Well, policy files were used earlier, and there are currently plans to upgrade to the database, but nevertheless we should understand what is happening here, right? Any thoughts toward a better RCA?