Created 05-25-2017 10:44 PM
Hey,
I need help with Kerberos integration. We have integrated NiFi with Kerberos by setting the necessary properties, and we are trying to use the "Hive Streaming" processor, but it is not working because of a Kerberos issue.
When I look at nifi-app.log, I see the following entry, followed by a Validator exception:
2017-05-25 16:50:01,749 INFO [StandardProcessScheduler Thread-1] o.a.nifi.dbcp.hive.HiveConnectionPool HiveConnectionPool[id=638acb69-015b-1000-0000-000037676c20] Hive Security Enabled, logging in as principal null with keytab null
The Kerberos principal and the keytab are both "null", even though:
1. nifi.kerberos.krb5.file is set and the file is accessible to the nifi account
2. nifi.kerberos.service.keytab.location is set and the file is accessible to the nifi account
3. nifi.kerberos.service.principal is set (format is nifi/HOST@ClusterName)
SPNEGO authentication is also set, but I believe that is only for authenticating with NiFi itself (correct me if I am wrong).
Any help is appreciated. Thanks!
Created 05-26-2017 04:02 PM
There are two different things here:
- HiveConnectionPool for use with the PutHiveQL/SelectHiveQL processors which go through the JDBC interface
- PutHiveStreaming for ingesting through Hive streaming
You mentioned Hive Streaming, but then your log shows HiveConnectionPool which is not for Hive streaming.
Either way, HiveConnectionPool and PutHiveStreaming both have properties in their config for Kerberos Principal and Kerberos Keytab which need to be filled in, and I believe when using HiveConnectionPool the JDBC connection string also needs the principal specified.
The nifi.kerberos.service.keytab.location and nifi.kerberos.service.principal settings in nifi.properties are not used by processors; they are only for framework-level things where NiFi itself needs to talk to another service.
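To illustrate the connection string point, here is a minimal sketch of what a Kerberized HiveServer2 JDBC URL typically looks like. The helper name, host, and realm are made up for illustration; the `;principal=` part refers to the HiveServer2 service principal, which is separate from the Kerberos Principal and Kerberos Keytab properties you fill in on the HiveConnectionPool itself. Check it against your own cluster before relying on it.

```python
def hive_jdbc_url(host, port=10000, database="default", hive_principal=None):
    """Build a HiveServer2 JDBC URL (hypothetical helper, for illustration only).

    hive_principal is the HiveServer2 *service* principal, for example
    "hive/_HOST@EXAMPLE.COM". It is not the principal NiFi logs in with.
    """
    url = f"jdbc:hive2://{host}:{port}/{database}"
    if hive_principal:
        # On a Kerberized cluster the service principal is appended
        # as a ";principal=" session parameter.
        url += f";principal={hive_principal}"
    return url


# Host and realm below are placeholders, not values from this thread.
print(hive_jdbc_url("hiveserver.example.com",
                    hive_principal="hive/_HOST@EXAMPLE.COM"))
```

The resulting string would go into the Database Connection URL property of the HiveConnectionPool.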
Created 05-26-2017 05:04 PM
Ah, thanks! Can you point me to where I can find these options? I will check the NiFi manual anyway. Thanks a lot! I will post back if anything solves our problem.
Created 05-26-2017 05:14 PM
Look for Kerberos Principal and Kerberos Keytab on these pages:
Created 05-26-2017 07:13 PM
Finally got it working! The "null" pointer exceptions in the log were from different processors, not the one I was looking at. I also had a workaround from Hortonworks support, which I implemented and tested, and I saw data streaming into the Hive table. Thanks a lot! Your answer helped me get my understanding right.
Created 05-26-2017 05:37 PM
Thanks. I am actually trying to debug a flow written by someone else, so I am learning as I go. I see that the log mentions a unique ID, which I think corresponds to some processor. Now my job is to find the processors that correspond to the IDs producing the null pointer exceptions. I have been clicking through the flow to find them. Is there an easier way?
Created 05-26-2017 05:54 PM
In the top-right corner of the NiFi UI there should be a search icon. Click it and enter the ID you are looking for; it should show a list of results, and clicking a component will take you right to it.
Created 05-26-2017 07:12 PM
Thank you, sir. I tried that, and when I open the processor that the search result refers to, it has a different ID. Why would it show that? Thanks again for saving my time. Great feature.
Created 05-26-2017 08:08 PM
I found the answer to this myself. The ID you search for is not necessarily a processor ID. In my case, it was the ID of a controller service referenced inside the processor. That's a good lesson for the day. Thanks much.
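For anyone doing the same kind of debugging, here is a rough sketch of pulling the component names and IDs out of nifi-app.log so they can be pasted into the UI search box. It assumes the `Name[id=<uuid>]` pattern seen in the log line at the top of this thread; the function name is made up, and other log formats may need a different pattern. The component name in front of the ID (HiveConnectionPool here) is also a hint as to whether the ID belongs to a processor or a controller service.

```python
import re
from collections import Counter

# Matches e.g. "HiveConnectionPool[id=638acb69-015b-1000-0000-000037676c20]"
COMPONENT_RE = re.compile(
    r"(\w+)\[id=([0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})\]"
)


def component_ids(lines):
    """Count (component name, id) pairs found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        for name, component_id in COMPONENT_RE.findall(line):
            counts[(name, component_id)] += 1
    return counts


# The log line quoted in the original question.
sample = (
    "2017-05-25 16:50:01,749 INFO [StandardProcessScheduler Thread-1] "
    "o.a.nifi.dbcp.hive.HiveConnectionPool "
    "HiveConnectionPool[id=638acb69-015b-1000-0000-000037676c20] "
    "Hive Security Enabled, logging in as principal null with keytab null"
)

for (name, component_id), n in component_ids([sample]).items():
    print(name, component_id, n)
```

To run it over a real log, pass an open file instead of the sample list, for example `component_ids(open("nifi-app.log"))`.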