Created on 10-15-2018 03:48 PM - edited 08-17-2019 08:49 PM
I am implementing the following access pattern from Tableau (server or desktop) to Hiveserver2:
Tableau Desktop MacOS + Kerberos --> Knox (topology with Kerberos/hadoop-auth) --> Hiveserver2 Kerberos
I got this to work, but the only thing that is lacking is adding 'user impersonation'. What I mean by that is similar behaviour you get when using beeline and having the:
;hive.server2.proxy.user=<end-user>
switch to have Ranger policies for <end-user> applied and not for the service user that is Kerberos authenticated to Hiveserver2 (that would be 'knox' in this case).
So that is the caveat at the moment; all hive interactions show as user 'knox' on Ranger audits and not the <end-user>.
On the Windows version of the Hortonworks Hive ODBC driver there is an input box for 'Delegation UID' which seems to be just the option I am after, but on the OsX version of the driver it is different. You can manage many options on the odbc.ini files
cat /Library/hortonworks/hive/Setup/odbc.ini [ODBC] # Specify any global ODBC configuration here such as ODBC tracing. [ODBC Data Sources] Hortonworks Hive=Hortonworks Hive ODBC Driver [Hortonworks Hive] # Description: DSN Description. # This key is not necessary and is only to give a description of the data source. Description=Hortonworks Hive ODBC Driver DSN # Driver: The location where the ODBC driver is installed to. Driver=/Library/hortonworks/hive/lib/universal/libhortonworkshiveodbc.dylib # When using No Service Discovery, specify the IP address or host name of the Hive server. # When using ZooKeeper as the Service Discovery Mode, specify a comma-separated list of ZooKeeper # servers in the following format: # <zk_host1:zk_port1>,<zk_host2:zk_port2>,... HOST= # The TCP port Hive server is listening. This is not required when using ZooKeeper as the service # discovery mode as the port is specified in the HOST connection attribute. PORT= # The name of the database schema to use when a schema is not explicitly specified in a query. Schema=default # Set to 0 to when connecting directory to Hive Server 2 (No Service Discovery). # Set to 1 to do Hive Server 2 service discovery using ZooKeeper. # Note service discovery is not support when using Hive Server 1. ServiceDiscoveryMode=0 # The namespace on ZooKeeper under which Hive Server 2 znodes are added. Required only when doing # HS2 service discovery with ZooKeeper (ServiceDiscoveryMode=1). ZKNamespace= # Set to 1 if you are connecting to Hive Server 1. Set to 2 if you are connecting to Hive Server 2. HiveServerType=2 # The authentication mechanism to use for the connection. # Set to 0 for No Authentication # Set to 1 for Kerberos # Set to 2 for User Name # Set to 3 for User Name and Password # Note only No Authentication is supported when connecting to Hive Server 1. AuthMech=1 # The Thrift transport to use for the connection. # Set to 0 for Binary # Set to 1 for SASL # Set to 2 for HTTP # Note for Hive Server 1 only Binary can be used. ThriftTransport=2 # When this option is enabled (1), the driver does not transform the queries emitted by an # application, so the native query is used. # When this option is disabled (0), the driver transforms the queries emitted by an application and # converts them into an equivalent from in HiveQL. UseNativeQuery=0 # Set the UID with the user name to use to access Hive when using AuthMech 2 to 8. UID= # The following is settings used when using Kerberos authentication (AuthMech 1 and 10) # The fully qualified host name part of the of the Hive Server 2 Kerberos service principal. # For example if the service principal name of you Hive Server 2 is: # hive/myhs2.mydomain.com@EXAMPLE.COM # Then set KrbHostFQDN to myhs2.mydomain.com KrbHostFQDN=_HOST # The service name part of the of the Hive Server 2 Kerberos service principal. # For example if the service principal name of you Hive Server 2 is: # hive/myhs2.mydomain.com@EXAMPLE.COM # Then set KrbServiceName to hive KrbServiceName=hive # The realm part of the of the Hive Server 2 Kerberos service principal. # For example if the service principal name of you Hive Server 2 is: # hive/myhs2.mydomain.com@EXAMPLE.COM # Then set KrbRealm to EXAMPLE.COM KrbRealm=EDL.DEV.BASF.COM # Set to 1 to enable SSL. Set to 0 to disable. SSL=1 # Set to 1 to enable two-way SSL. Set to 0 to disable. You must enable SSL in order to # use two-way SSL. TwoWaySSL=0 # The file containing the client certificate in PEM format. This is required when using two-way SSL. ClientCert=/Users/jknulst/Documents/Customers/BASF/Knox/gateway-identity.pem # The client private key. This is used for two-way SSL authentication. ClientPrivateKey= # The password for the client private key. Password is only required for password protected # client private key. ClientPrivateKeyPassword=
So there is an option
# Set the UID with the user name to use to access Hive when using AuthMech 2 to 8. UID=
but it states specifically that it can't be used in tandem with Kerberos auth.
So, the question is ; When connecting Tableau to Hiveserver2 can you pass the desired end-user to Hiveserver2 using the HWX ODBC Hive driver to get similar impersonation like using 'hive.server2.proxy.user' on a jdbc connect string ?
Created 01-21-2019 10:31 AM
Hi! Have you solved this problem? If yes, could you please share your solution? I have the same situation I need to resolve. Thanks!
Created 01-22-2019 07:48 AM
If I remember it correctly it was quite simple in the end. Just make sure you have the snippet below in your topology:
<provider> <role>identity-assertion</role> <name>Default</name> <enabled>true</enabled> </provider>
This should get you the default behaviour where Knox propagates your identity forward to the proxied service, not its own