Created 01-04-2018 05:38 PM
Hi
I am using the FetchHDFS NiFi processor, which works fine for fetching a specific HDFS file. I want to get all HDFS files under a directory, so I am using GetHDFS with the "Keep Source File" option set to "True". But I am getting a Kerberos error:

ERROR [Timer-Driven Process Thread-1] o.apache.nifi.processors.hadoop.GetHDFS GetHDFS[id=XXXXXXXXXX] Error retrieving file hdfs://XXXXXXXXXXXXXXXXXXXX.0. from HDFS due to java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt): {}
java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:311)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:287)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:287)

I am wondering why the same Kerberos credentials work for FetchHDFS/ListHDFS but not for GetHDFS.
Does GetHDFS need additional setup? Please suggest.
Thanks
Srikaran
Created 01-04-2018 06:21 PM
Does the error go away if you stop and restart the GetHDFS processor? If so, it may be similar to a timing issue we had. Try setting the "Kerberos relogin period" on the processor to less than 20% of your Kerberos ticket lifetime. See: https://community.hortonworks.com/idea/155525/avoiding-kerberos-errors-in-nifi-hdfs-processors.html
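As a rough illustration of the 20% guideline above, here is a minimal shell sketch that computes the upper bound for the relogin period from a ticket lifetime. The helper name `relogin_limit_seconds` and the 10-hour lifetime are assumptions for the example only; check your actual ticket lifetime with klist.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of NiFi): returns 20% of the ticket lifetime,
# which the advice above treats as the ceiling for the relogin period.
relogin_limit_seconds() {
  local lifetime_seconds=$1
  echo $(( lifetime_seconds * 20 / 100 ))
}

# Assumed example value: a 10-hour ticket lifetime, expressed in seconds.
lifetime=$(( 10 * 3600 ))
limit=$(relogin_limit_seconds "$lifetime")
echo "Ticket lifetime: ${lifetime}s; keep the relogin period below ${limit}s"
```

With a 10-hour ticket this gives a ceiling of 7200 seconds (2 hours), so whether a given relogin period is short enough depends on the lifetime your KDC actually issues.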
Created 01-04-2018 06:37 PM
Hi Karl, the same issue occurs after a stop and restart. I tried 1 hour and 4 hours for the Kerberos relogin period, the same values I use for FetchHDFS/ListHDFS. This happens only with GetHDFS. I assume the GetHDFS processor is trying to delete, move, or write, which might need additional permissions. The HDFS files are owned by hive:hive with 771 permissions. With the same hive:hive ownership and 771 permissions, FetchHDFS and ListHDFS work. Thanks
Created 01-04-2018 10:33 PM
When you say hive:hive is the owner, I assume that is the user (Kerberos principal) you are providing in your GetHDFS processor. Also, please check the permissions on the HDFS folders. The user (Kerberos principal) must have write permission on the folders you are deleting files in or moving files from and to. Can you try the same operation with an hdfs command on the console to confirm it works outside of NiFi? Authenticate with Kerberos using the same user and keytab and try a move command:
kinit -k -t hive.keytab hive
hdfs dfs -mv /user/hadoop/file1 /user/hadoop/file2
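When checking folder permissions, it can help to spell out what the 771 mode mentioned earlier in the thread actually grants. The sketch below is a small, self-contained decoder; the function names `digit_to_rwx` and `mode_to_rwx` are made up for illustration and are not part of any Hadoop tooling.

```shell
#!/usr/bin/env bash
# Translate one octal permission digit (0-7) into its rwx form.
digit_to_rwx() {
  local d=$1 out=""
  if (( d & 4 )); then out+="r"; else out+="-"; fi
  if (( d & 2 )); then out+="w"; else out+="-"; fi
  if (( d & 1 )); then out+="x"; else out+="-"; fi
  echo "$out"
}

# Expand a three-digit octal mode into owner/group/other permission strings.
mode_to_rwx() {
  local mode=$1
  echo "$(digit_to_rwx "${mode:0:1}")$(digit_to_rwx "${mode:1:1}")$(digit_to_rwx "${mode:2:1}")"
}

mode_to_rwx 771   # prints rwxrwx--x
```

So with 771, the owner and the group have full access, while every other user has execute only: no read and no write. If the principal used by the processor is neither the owner nor in the owning group, a delete or move would be expected to fail on permissions.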