Created 03-27-2023 04:40 AM
Hello
We have an HDFS cluster configured to work with Kerberos, and we want to use the NiFi processor PutParquet to write files into the Hadoop cluster.
Do we need to configure NiFi to work with Kerberos too in order to connect to Hadoop?
What are the steps I need to take to make the connection between the two clusters (NiFi --> HDFS)?
Thanks
Created 03-27-2023 05:15 AM
At a minimum, the following is required in NiFi to connect to an HDFS service secured by Kerberos:
1. Copies of the core-site.xml and hdfs-site.xml files from the HDFS cluster need to be placed on the NiFi hosts so they can be referenced in the Hadoop Configuration Resources property.
2. A Kerberos user principal and a copy of its keytab file on each NiFi node, to be used by the KeytabCredentialsService or configured directly in the processor.
3. From the NiFi hosts, the user running the NiFi application (by default, nifi) must be able to obtain Kerberos tickets, using the user principal and keytab file configured in step 2, from the KDC server used by the HDFS service. For this, the krb5.conf Kerberos client file needs to be updated with the respective KDC realm details. See the configuration sketch after this list.
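To illustrate how these pieces fit together, here is a minimal sketch of the processor and controller service settings. The file paths and the principal nifi@EXAMPLE.COM are placeholders for your environment, and property names may vary slightly between NiFi versions:

    PutParquet processor:
        Hadoop Configuration Resources : /etc/hadoop/conf/core-site.xml,/etc/hadoop/conf/hdfs-site.xml
        Kerberos Credentials Service   : KeytabCredentialsService

    KeytabCredentialsService controller service:
        Kerberos Principal : nifi@EXAMPLE.COM
        Kerberos Keytab    : /etc/security/keytabs/nifi.keytab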
If this response assisted with your issue, please take a moment and click "Accept as Solution" below this post.
Thank you
Created 03-27-2023 05:27 AM
Thanks for your reply,
Step 2 - a. Where can I find the user principal? b. There are several keytabs, one for each service. Which one do I need?
Step 3 - It's not very clear. Do I need to install Kerberos on the NiFi cluster in order to connect to the HDFS cluster? Do I need to copy the krb5.conf file to the NiFi cluster?
Thanks
Created 03-27-2023 06:28 AM
Regarding step 2: you have to determine the HDFS directory where NiFi PutParquet will write the files, and which user has access to that directory path on HDFS; that user's principal and associated keytab are what you need. Since HDFS is secured by Kerberos, the user has to obtain a Kerberos ticket, by running kinit with the user principal and keytab, to access it on the HDFS side. See the commands sketched below.
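For example, you can list which principal a keytab contains (which answers 2a and 2b) and then verify that it can obtain a ticket. The keytab path, principal, and target directory below are placeholders; the last command also assumes the HDFS client is installed on that host:

    # Show the principals stored in a keytab
    klist -kt /etc/security/keytabs/nifi.keytab

    # Obtain a Kerberos ticket with that principal and keytab
    kinit -kt /etc/security/keytabs/nifi.keytab nifi@EXAMPLE.COM

    # Confirm the ticket and test access to the target HDFS directory
    klist
    hdfs dfs -ls /data/parquet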
Regarding step 3: there is no need to install the Kerberos server. NiFi only needs a Kerberos client on the NiFi hosts, which is installed by default on most Linux distributions.
The client configuration file is located at /etc/krb5.conf. It tells the Kerberos client which Kerberos server NiFi PutParquet should contact to obtain a Kerberos ticket using the configured user principal/keytab details, so you have to update the krb5.conf file with the Kerberos server details, i.e. the KDC realm details. A sketch follows below.
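A minimal /etc/krb5.conf sketch, assuming a realm named EXAMPLE.COM and a KDC host kdc.example.com; substitute the realm and host names actually used by your HDFS cluster:

    [libdefaults]
        default_realm = EXAMPLE.COM

    [realms]
        EXAMPLE.COM = {
            kdc = kdc.example.com
            admin_server = kdc.example.com
        }

    [domain_realm]
        .example.com = EXAMPLE.COM
        example.com = EXAMPLE.COM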
If this additional response assisted with your issue, please take a moment and click "Accept as Solution" below this post.
Thank you
Created 03-27-2023 06:59 AM
Thanks,
I am getting the following error:

    ERROR org.apache.nifi.processors.parquet.PutParquet: PutParquet[id=c6dee132-cb63-3b8b-9148-ec10de8044c4] HDFS Configuration error - java.lang.IllegalArgumentException: Can't get Kerberos realm: java.lang.IllegalArgumentException: KrbException: Cannot locate default realm
    ↳ causes: java.lang.IllegalArgumentException: Can't get Kerberos realm