
Kerberos issue for GetHDFS But Not FetchHDFS/ListHDFS - NIFI


Hi

I am using the NiFi FetchHDFS processor, which works fine for fetching one specific HDFS file. I want to get all HDFS files under a directory, so I am using GetHDFS with the "Keep Source File" option set to "True". But I am getting a Kerberos error:

ERROR [Timer-Driven Process Thread-1] o.apache.nifi.processors.hadoop.GetHDFS GetHDFS[id=XXXXXXXXXX] Error retrieving file hdfs://XXXXXXXXXXXXXXXXXXXX.0. from HDFS due to java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt): {}
java.io.IOException: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
Caused by: org.apache.hadoop.security.authentication.client.AuthenticationException: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:332)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.authenticate(KerberosAuthenticator.java:205)
Caused by: org.ietf.jgss.GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:311)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator$1.run(KerberosAuthenticator.java:287)
    at org.apache.hadoop.security.authentication.client.KerberosAuthenticator.doSpnegoSequence(KerberosAuthenticator.java:287)

I am wondering why the same Kerberos credentials work for FetchHDFS/ListHDFS but not for GetHDFS.

Does GetHDFS need additional setup? Please suggest.

Thanks

Srikaran

3 REPLIES

Does the error go away if you stop and restart the GetHDFS processor? If so, it may be similar to a timing issue we had. Try setting the "Kerberos relogin period" on the processor to less than 20% of your Kerberos ticket lifetime. See: https://community.hortonworks.com/idea/155525/avoiding-kerberos-errors-in-nifi-hdfs-processors.html
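
To make the 20% rule of thumb concrete, here is a minimal sketch of the arithmetic. The 10-hour ticket lifetime is only an assumed example value and the variable names are made up for illustration; check the actual lifetime of your ticket (for example with klist) and substitute it.

```shell
# Assumed example value: a 10-hour ticket lifetime, expressed in minutes.
# Replace with the real lifetime of your ticket.
ticket_lifetime_minutes=600

# Upper bound for the relogin period: 20% of the ticket lifetime.
max_relogin_minutes=$(( ticket_lifetime_minutes * 20 / 100 ))

echo "Keep the Kerberos relogin period below ${max_relogin_minutes} minutes"
```

With a 10-hour ticket this gives an upper bound of 120 minutes, so a relogin period of 1 hour would be inside the suggested range.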

@Karl Fredrickson

Hi Karl. I get the same issue after stopping and restarting. I tried 1 hour and 4 hours for the Kerberos relogin period, since I am using the same relogin period for FetchHDFS/ListHDFS. This is happening only for GetHDFS. I am assuming the GetHDFS processor is trying to delete, move, or write, which might need some other permissions. The HDFS files are owned by hive:hive with 771 permissions. With the same 771 permissions and hive:hive ownership, FetchHDFS and ListHDFS are working. Thanks

@Srikaran Jangidi

When you say hive:hive is the owner, I am assuming that is the user (Kerberos principal) you are providing in your GetHDFS processor. Also, please check the permissions on the HDFS folders. The user (Kerberos principal) must have write permission on the folders you are trying to delete files in or move files from and to. Can you try the same operation using an hdfs command on the console to confirm it works outside of NiFi? Authenticate with Kerberos using the same user and keytab, then try a move command:

kinit -k -t hive.keytab hive
hdfs dfs -mv /user/hadoop/file1 /user/hadoop/file2
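
To read the 771 permissions mentioned earlier in the thread, a small helper like the one below may be useful. It is only an illustrative sketch: decode_digit and decode_mode are hypothetical names, not part of Hadoop or NiFi, and the script only translates a three-digit octal mode into rwx notation without contacting HDFS.

```shell
# Decode one octal permission digit (0-7) into rwx notation.
decode_digit() {
  d=$1
  r=-; w=-; x=-
  if [ $(( d & 4 )) -ne 0 ]; then r=r; fi   # read bit
  if [ $(( d & 2 )) -ne 0 ]; then w=w; fi   # write bit
  if [ $(( d & 1 )) -ne 0 ]; then x=x; fi   # execute/traverse bit
  printf '%s%s%s' "$r" "$w" "$x"
}

# Decode a three-digit octal mode (owner, group, other).
decode_mode() {
  mode=$1
  owner=${mode%??}   # first digit
  rest=${mode#?}     # last two digits
  group=${rest%?}    # second digit
  other=${mode#??}   # third digit
  printf '%s%s%s\n' "$(decode_digit "$owner")" "$(decode_digit "$group")" "$(decode_digit "$other")"
}

decode_mode 771   # prints rwxrwx--x
```

So with 771, the owner and the group have read, write, and execute, while everyone else can only traverse. If the principal used by GetHDFS is neither the hive user nor a member of the hive group, it would have no write access to such a directory. The mode of the directory itself can be checked with hdfs dfs -ls -d followed by the directory path.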