NiFi and Kerberos delegation

Explorer

Hi,

  1. Do NiFi processors that interact with Hadoop components (such as GetHDFS or PutHDFS) use a delegation token in a Kerberized environment? I'd like every action performed by a NiFi processor to be done on behalf of the logged-in user.
  2. Is this possible without Kerberos?

Thanks

@David D

GetHDFS and PutHDFS use the principal configured on the processor when interacting with a Kerberos-enabled environment.

What do you mean by logged-in user? The user running NiFi?

@David D

  1. Do NiFi processors that interact with Hadoop components (such as GetHDFS or PutHDFS) use a delegation token in a Kerberized environment? I'd like every action performed by a NiFi processor to be done on behalf of the logged-in user.

No actions performed by NiFi processors such as GetHDFS or PutHDFS (or any processor, AFAIK) are performed on behalf of the logged-in user. Processor actions are performed by the NiFi nodes.

  2. Is this possible without Kerberos? If the Hadoop cluster is Kerberized, you need to use keytabs to communicate from NiFi to the Hadoop cluster.

HTH

*** If you found that this answer addressed your question, please take a moment to log in and click the "accept" link on the answer.

Master Guru
@David D

Felix is correct. All processors on the NiFi canvas execute their code as the user who owns the NiFi JVM process. Authenticated users are granted the authorizations needed to build dataflows, but an authenticated user is in no way tied to the execution of the processors' code.

-

However, processors that use connection authentication methods such as Kerberos or username/password will connect to the target system using the credentials provided in the processor configuration.

-

Processors for which no authentication credentials are provided will execute as the NiFi service user.
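
To make the rule above concrete, here is a minimal Java sketch of how the effective identity could be modeled. This is an illustration only, not NiFi source code: the class `EffectiveIdentity`, the method `resolve`, and the principal `nifi@EXAMPLE.COM` are all hypothetical names chosen for the example.

```java
import java.util.Optional;

// Illustrative model only; this is NOT NiFi source code.
final class EffectiveIdentity {

    /**
     * Returns the identity the target system would see for a processor run.
     *
     * @param configuredPrincipal credentials set in the processor configuration, if any (hypothetical)
     * @param serviceUser         the OS user that owns the NiFi JVM process
     * @param uiUser              the user authenticated in the UI; deliberately ignored
     */
    static String resolve(Optional<String> configuredPrincipal, String serviceUser, String uiUser) {
        // Configured credentials win; otherwise fall back to the service user.
        // The UI user never takes part in the decision.
        return configuredPrincipal.orElse(serviceUser);
    }

    public static void main(String[] args) {
        System.out.println(resolve(Optional.of("nifi@EXAMPLE.COM"), "nifi", "john"));
        System.out.println(resolve(Optional.empty(), "nifi", "john"));
    }
}
```

Changing the `uiUser` argument has no effect on the result, which is the point being made in this reply.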

Explorer

@Felix Albani & @Matt Clarke:

Thanks to both of you for your answers. If I understood correctly, "dynamic" delegation is not an option... My use case is as follows:

  • User John logs in to NiFi and runs a pipeline containing a processor, e.g. PutHDFS, which writes into `/some/dir`. A Ranger policy on that directory grants write permission to user John.
  • User James logs in to NiFi and runs the same pipeline containing the same processor, e.g. PutHDFS, which writes into `/some/dir`. A Ranger policy on that directory denies write permission to user James.

Is it possible to achieve this kind of use case? Basically, the credentials used by the processor would depend on the logged-in user.

Master Guru

@David D

-

That is not possible. Every component added in NiFi is executed by the service user (the user running the NiFi JVM).

-

NiFi authorizations simply control which users can access the canvas and what types of actions they can perform while logged in to the canvas. A logged-in user has no association with the processors themselves.
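
Applied to the John/James use case, a toy Java model shows the consequence. Everything here is hypothetical and for illustration only: the class `WritePolicyDemo`, the identity `nifi-svc`, and the policy table are stand-ins for a Ranger-style write policy on `/some/dir`, not real Ranger or NiFi APIs.

```java
import java.util.Map;

// Illustrative model only: a toy stand-in for a Ranger-style write policy on /some/dir.
final class WritePolicyDemo {

    // Hypothetical policy table: identity -> write allowed on /some/dir.
    static final Map<String, Boolean> WRITE_POLICY = Map.of(
            "john", true,
            "james", false,
            "nifi-svc", true);

    /**
     * The policy is evaluated for the identity the processor connects as.
     * The UI user who started the flow is accepted as a parameter only to
     * show that it plays no part in the outcome.
     */
    static boolean canWrite(String uiUser, String processorIdentity) {
        return WRITE_POLICY.getOrDefault(processorIdentity, false);
    }

    public static void main(String[] args) {
        System.out.println(canWrite("john", "nifi-svc"));
        System.out.println(canWrite("james", "nifi-svc"));
    }
}
```

Both calls give the same answer, because the target system only ever sees the identity used by the processor. In this model, the deny policy for James never comes into play.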

-

Thanks,

Matt

Explorer

@Matt Clarke

Thanks for the clarification. I'm quite new to Hadoop security and Kerberos, so I was wondering whether there is some architectural obstacle to implementing what I described. It doesn't seem very difficult code-wise: I had a quick look and found that Kerberos authentication is done in AbstractHadoopProcessor#resetHDFSResources, so I would just need to get the logged-in user from a processor and do some Kerberos delegation.

Master Guru

@David D

Where would the processor then store that information?

-

There is no notion of a logged-in user in NiFi. Every single request made to the NiFi UI requires authorization. Before a user request can be checked for proper authorization, the user must be successfully authenticated. The default user authentication is via SSL certificates. In this case the user certificate is expected in every REST endpoint request. For other authentication methods such as LDAP or Kerberos, the user is issued a token upon successful authentication, which the browser stores. That token is only valid for a configured amount of time before the user must log in again. The token must be included in every REST endpoint request, each of which triggers an authorization check. So there really is no place for NiFi code to check which user is "logged in". Not to mention that a user logs in to a specific node in a NiFi cluster. How would the same processor executing on other nodes in the cluster know which user to use?

-

Which user would that processor use the next time it is scheduled to run if that user no longer has a valid token?

-

What happens if NiFi ends up being restarted?

-

There are a lot of things to consider there, none of which NiFi at its core was designed to handle.
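
The token-expiry problem can be sketched in a few lines of Java. This is a simplified model under stated assumptions, not NiFi code: `UiTokenDemo`, `UiToken`, and `userForRun` are hypothetical names, and the 12-hour lifetime is an arbitrary value chosen for the example.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Optional;

// Illustrative model only; the names here are hypothetical and are not NiFi APIs.
final class UiTokenDemo {

    /** A UI token: issued at login and valid only for a configured lifetime. */
    static final class UiToken {
        final String user;
        final Instant issuedAt;
        final Duration lifetime;

        UiToken(String user, Instant issuedAt, Duration lifetime) {
            this.user = user;
            this.issuedAt = issuedAt;
            this.lifetime = lifetime;
        }

        boolean isValidAt(Instant now) {
            return now.isBefore(issuedAt.plus(lifetime));
        }
    }

    /** Identity a scheduled run could borrow from the token, if the token is still valid. */
    static Optional<String> userForRun(UiToken token, Instant runTime) {
        return token.isValidAt(runTime) ? Optional.of(token.user) : Optional.empty();
    }

    public static void main(String[] args) {
        Instant login = Instant.parse("2024-01-01T08:00:00Z");
        UiToken token = new UiToken("john", login, Duration.ofHours(12)); // assumed lifetime
        System.out.println(userForRun(token, login.plus(Duration.ofHours(1))));
        System.out.println(userForRun(token, login.plus(Duration.ofHours(13))));
    }
}
```

A run scheduled one hour after login still finds a user, while a run thirteen hours after login finds none, even though the processor itself is still scheduled. The model also ignores that the token lives in the user's browser rather than with the processor, which makes the real situation harder still.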

-

Thanks,

Matt
