Member since: 06-08-2016
Posts: 10
Kudos Received: 3
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 4806 | 06-21-2016 12:27 PM |
03-02-2018 12:42 PM
@spolavarapu Thank you. This works as I wanted. Previously I missed the point that when "group search first" = YES and "enable user search" = YES, the usersync service does a kind of join between the list of users extracted from the group definitions and the users returned by the user search query. In my case I must implement a small optimization: the AD I will eventually bind to is a large directory of users, and I don't want the user search to pull the whole directory, so I created an additional group whose only purpose is to expose its members to Hadoop. Of course, I added this group to the user search filter. To make a user available in Ranger, I will add it to this group and to other, more specific groups (which will act as roles). Regards, Pit
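To illustrate the optimization described above (the extra group name below is just a placeholder), the relevant usersync fields end up looking roughly like this:

```
Group Search Filter:        (cn=myprefix*)
Enable Group Sync:          YES
Enable Group Search First:  YES
User Search Filter:         (memberOf=CN=myprefix-hadoop-users,CN=Users,DC=mydomain,DC=local)
Enable User Search:         YES
Group User Map Sync:        YES
```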
03-01-2018 05:21 PM
1 Kudo
Hi,
I need to synchronize users from AD to Ranger. My requirement is fairly simple: synchronize all users who belong to specific groups. The group names are prefixed with a specific keyword. I found a way to achieve this, but I am wondering if there is a better way. Importantly, I want to use the AD attribute "sAMAccountName" as the user name. In Ambari, on the Ranger User Info tab, I tried several combinations of usersync parameters.

1. At first I tried:
Group Search Filter: (cn=myprefix*)
Enable Group Sync: YES
Enable Group Search First: YES
Enable User Search: NO
This was almost what I wanted, but there is a problem: the user name is taken directly from the "member" attribute of the group record, and in my case it differs from sAMAccountName.

2. I changed the settings to:
Group Search Filter: (cn=myprefix*)
Enable Group Sync: YES
Enable Group Search First: YES
User Search Filter: (memberOf=cn=myprefix*)
Enable User Search: YES
Group User Map Sync: YES
Unfortunately this does not work, because the LDAP server built into AD does not support wildcard queries on the memberOf attribute (at least in the version I have).

3. I changed the settings to:
Group Search Filter: (cn=myprefix*)
Enable Group Sync: YES
Enable Group Search First: YES
User Search Filter: (|(memberOf=CN=myprefix-group1,CN=Users,DC=mydomain,DC=local)(memberOf=CN=myprefix-group2,CN=Users,DC=mydomain,DC=local))
Enable User Search: YES
Group User Map Sync: YES
This does what I want. The problem is that I have to list the group names explicitly in the user search filter. The group names, and the number of groups, will change in the future, and I would like to avoid changing the filter each time. Is that possible at all? Thanks and Regards, Pit
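For illustration, a small JNDI sketch that compares the two user search filters above against AD; all connection details (host, bind account, search base) are placeholders:

```java
import java.util.Hashtable;
import javax.naming.Context;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.InitialDirContext;
import javax.naming.directory.SearchControls;
import javax.naming.directory.SearchResult;

public class AdFilterCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder connection details -- replace with your AD host and bind account.
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.INITIAL_CONTEXT_FACTORY, "com.sun.jndi.ldap.LdapCtxFactory");
        env.put(Context.PROVIDER_URL, "ldap://ad.mydomain.local:389");
        env.put(Context.SECURITY_AUTHENTICATION, "simple");
        env.put(Context.SECURITY_PRINCIPAL, "CN=binduser,CN=Users,DC=mydomain,DC=local");
        env.put(Context.SECURITY_CREDENTIALS, "secret");
        InitialDirContext ctx = new InitialDirContext(env);

        SearchControls sc = new SearchControls();
        sc.setSearchScope(SearchControls.SUBTREE_SCOPE);
        sc.setReturningAttributes(new String[] {"sAMAccountName"});

        // Attempt 2: wildcard on memberOf -- AD does not support substring matching here.
        String wildcard = "(memberOf=cn=myprefix*)";
        // Attempt 3: explicit OR over the known groups -- works, but the group list is hard-coded.
        String explicit = "(|(memberOf=CN=myprefix-group1,CN=Users,DC=mydomain,DC=local)"
                + "(memberOf=CN=myprefix-group2,CN=Users,DC=mydomain,DC=local))";

        for (String filter : new String[] {wildcard, explicit}) {
            try {
                int count = 0;
                NamingEnumeration<SearchResult> results =
                        ctx.search("CN=Users,DC=mydomain,DC=local", filter, sc);
                while (results.hasMore()) {
                    results.next();
                    count++;
                }
                System.out.println(filter + " -> " + count + " user(s)");
            } catch (NamingException e) {
                System.out.println(filter + " -> rejected: " + e.getMessage());
            }
        }
        ctx.close();
    }
}
```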
Labels:
- Apache Ranger
06-21-2016 12:27 PM
1 Kudo
We've managed to solve the problem. Deeper examination of the JDBC communication between the client and the Hive server shows that the cookie authentication mechanism, which should prevent subsequent authentication calls within a single session, requires the HTTP server to use SSL. Solution: either of the following resolves the issue:
- Enable SSL for hiveserver2 in HTTP transport mode for the default configuration of the service.
- If you don't need SSL, disable the requirement for secure cookies: set hive.server2.thrift.http.cookie.is.secure=false in hiveserver2-site.xml (see the snippet below).
Note: the hiveserver2 documentation lacks detailed information about the cookie authentication mechanism; only code and component debugging/tracing may shed some light on the investigation.
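For reference, a minimal hiveserver2-site.xml fragment for the second option; only the property named above is involved, everything else stays as it is:

```xml
<!-- hiveserver2-site.xml: allow the HTTP-mode authentication cookie without SSL. -->
<property>
  <name>hive.server2.thrift.http.cookie.is.secure</name>
  <value>false</value>
</property>
```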
06-16-2016 12:19 PM
I managed to isolate the issue a bit. It looks like it is not related to Oozie jobs but can be observed in Java code connecting to hiveserver2 over JDBC. I wrapped a query in some Java code and ran it in my test environment. In the code I make a JDBC connection to hiveserver2; before the connection is established, I perform Kerberos authentication. I wrote two separate apps: one uses the Java GSS API, the other authenticates using UserGroupInformation.loginUserFromKeytab. When hiveserver2 is in the binary transport mode, both apps perform well. When hiveserver2 is in the HTTP transport mode, the app which uses the Java GSS API calls my KDC before each fetch operation. I ran the app with JGSS debugging enabled; before each fetch, the following log is printed:
Search Subject for Kerberos V5 INIT cred (<<DEF>>, sun.security.jgss.krb5.Krb5InitCredential)
Found ticket for piterr@MYIPADOMAIN.BEETLE.INT to go to krbtgt/MYIPADOMAIN.BEETLE.INT@MYIPADOMAIN.BEETLE.INT expiring on Fri Jun 17 09:02:29 CEST 2016
Entered Krb5Context.initSecContext with state=STATE_NEW
Service ticket not found in the subject
>>> Credentials acquireServiceCreds: same realm
Using builtin default etypes for default_tgs_enctypes
default etypes for default_tgs_enctypes: 18 17 16 23.
>>> CksumType: sun.security.krb5.internal.crypto.RsaMd5CksumType
>>> EType: sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType
getKDCFromDNS using UDP
>>> KrbKdcReq send: kdc=fipasrv.beetle.int. UDP:88, timeout=30000, number of retries =3, #bytes=726
>>> KDCCommunication: kdc=fipasrv.beetle.int. UDP:88, timeout=30000,Attempt =1, #bytes=726
>>> KrbKdcReq send: #bytes read=706
>>> KdcAccessibility: remove fipasrv.beetle.int.:88
>>> EType: sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType
>>> KrbApReq: APOptions are 00000000 00000000 00000000 00000000
>>> EType: sun.security.krb5.internal.crypto.Aes256CtsHmacSha1EType
Krb5Context setting mySeqNumber to: 879752450
Krb5Context setting peerSeqNumber to: 0
Created InitSecContextToken:
0000: 01 00 6E 82 02 69 30 82 02 65 A0 03 02 01 05 A1 ..n..i0..e......
0010: 03 02 01 0E A2 07 03 05 00 00 00 00 00 A3 82 01 ................
0020: 72 61 82 01 6E 30 82 01 6A A0 03 02 01 05 A1 18 ra..n0..j.......
0030: 1B 16 4D 59 49 50 41 44 4F 4D 41 49 4E 2E 42 45 ..MYIPADOMAIN.BE
0040: 45 54 4C 45 2E 49 4E 54 A2 26 30 24 A0 03 02 01 ETLE.INT.&0$....
....
When I switch back to the binary transport mode, everything works smoothly.
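For context, the keytab-based app authenticates roughly as sketched below; the keytab path, hiveserver2 host, and query are placeholders, while the principal and realm are the ones shown in the log above:

```java
import java.security.PrivilegedExceptionAction;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;
import org.apache.hadoop.security.UserGroupInformation;

public class HiveJdbcKerberosTest {
    public static void main(String[] args) throws Exception {
        // Make sure the Hive JDBC driver is on the classpath.
        Class.forName("org.apache.hive.jdbc.HiveDriver");

        // Keytab path is a placeholder; the principal comes from the JGSS log above.
        UserGroupInformation.loginUserFromKeytab(
                "piterr@MYIPADOMAIN.BEETLE.INT", "/etc/security/keytabs/piterr.keytab");
        UserGroupInformation ugi = UserGroupInformation.getLoginUser();

        ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
            // HTTP transport mode; hostname, port, and query are placeholders.
            String url = "jdbc:hive2://hs2host.beetle.int:10001/default;"
                    + "transportMode=http;httpPath=cliservice;"
                    + "principal=hive/_HOST@MYIPADOMAIN.BEETLE.INT";
            try (Connection conn = DriverManager.getConnection(url);
                 Statement st = conn.createStatement();
                 ResultSet rs = st.executeQuery("SELECT COUNT(*) FROM some_table")) {
                while (rs.next()) {
                    System.out.println(rs.getLong(1));
                }
            }
            return null;
        });
    }
}
```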
06-16-2016 02:14 AM
Thanks for the answers. @Sunile Manjee - we have a dedicated KDC which responds quite quickly, and no network issues so far. @Terry Stebbens - we've already bumped the fetch size and that helped a lot. I am wondering whether it is normal that each fetch is authenticated.
06-08-2016 04:44 AM
1 Kudo
Hi, We're running hiveserver2 in a kerberized cluster. The hiveserver2 processes run on edge nodes. The HDP version is 2.3.0.0-2557, the Hive version is 1.2.1.2.3. We've got an automatic Oozie job (written in Java) which executes a Hive query. The performance of the job has become much worse after setting up Kerberos in the cluster (at least our developers say so). The job uses the Hive JDBC driver to connect to hiveserver2 in the HTTP transport mode, and we also use ZooKeeper service discovery. I am new to this specific HDP cluster and I am trying to understand what is going on. When the job is running, I see lots of entries like the following in hiveserver2.log:

2016-06-06 08:46:07,011 INFO [HiveServer2-HttpHandler-Pool: Thread-440]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(169)) - Cookie added for clientUserName oozie
2016-06-06 08:46:07,038 INFO [HiveServer2-HttpHandler-Pool: Thread-440]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doPost(127)) - Could not validate cookie sent, will try to generate a new cookie
2016-06-06 08:46:07,039 INFO [HiveServer2-HttpHandler-Pool: Thread-440]: thrift.ThriftHttpServlet (ThriftHttpServlet.java:doKerberosAuth(352)) - Failed to authenticate with http/_HOST kerberos principal, trying with hive/_HOST kerberos principal

The number of these entries for each job more or less matches the number of fetches (row_count/50). I understand that for some reason the cookie authentication doesn't work properly, and moreover something is wrong with the Kerberos authentication. On the KDC server, in krb5kdc.log, I see hundreds of thousands of entries like this:

Jun 07 13:37:06 kdcsrv01 krb5kdc[5469](info): TGS_REQ (4 etypes {18 17 16 23}) 10.141.5.25: ISSUE: authtime 1465306605, etypes {rep=18 tkt=18 ses=18}, oozie@BDATA.COM for hive/edge01@BDATA.COM

I re-executed a query (which is normally run by the job) in beeline, and it triggers exactly the same entries in hiveserver2.log as described above. I tried both cookieAuth=true and cookieAuth=false. I have limited access to the KDC machine and can't confirm right now whether the same issue shows up in krb5kdc.log in that case. Any ideas on how to proceed with the investigation will be appreciated. Regards, Pit
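For illustration, a beeline invocation of this kind (ZooKeeper hosts, namespace, and the query are placeholders; only cookieAuth was toggled between runs):

```
beeline -u "jdbc:hive2://zk1.bdata.com:2181,zk2.bdata.com:2181,zk3.bdata.com:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2;transportMode=http;httpPath=cliservice;principal=hive/_HOST@BDATA.COM;cookieAuth=false" -e "SELECT COUNT(*) FROM some_table"
```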
Labels:
- Apache Hive
- Apache Ranger