NiFi: How to put to HDFS with wire encryption?

Guru

In NiFi, how do I put data to HDFS with the data encrypted across the wire? The NiFi cluster would be on a separate cluster from HDFS.

1 ACCEPTED SOLUTION

Super Collaborator

Is this an on-prem to cloud kind of situation? Why would you want to encrypt only specific flows? For encryption to happen, the server has to enable it; for example, HTTPS goes to port 443 instead of 80. Enable Hadoop wire encryption and try sending data without the encryption settings in the hdfs-site.xml you give to PutHDFS. I think it will keep both secure and insecure channels open.


6 REPLIES

Super Collaborator

I think if you set up the cluster to use wire encryption, the NameNode and the HDFS client will handle it for you. You just need to have the updated hdfs-site.xml file available on NiFi.

Guru

I believe that is a cluster-wide setting for all client interactions with the NameNode/HDFS. I was hoping to isolate the encryption to specific flows on the NiFi side. Thoughts?

Super Collaborator

Saw this in hdfs-default.xml:

dfs.encrypt.data.transfer (default: false): Whether or not actual block data that is read/written from/to HDFS should be encrypted on the wire. This only needs to be set on the NN and DNs, clients will deduce this automatically. It is possible to override this setting per connection by specifying custom logic via dfs.trustedchannel.resolver.class.

I guess it is doable.
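
Since the setting lives on the NN and DNs, it can help to confirm which value a given hdfs-site.xml actually carries before testing from NiFi. The following is only a rough sketch using the JDK's XML parser rather than Hadoop's own Configuration class; the class name HdfsSiteCheck and the inline sample XML are made up for illustration, and the fallback "false" mirrors the default quoted above.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper: looks up one property in hdfs-site.xml style content.
class HdfsSiteCheck {

    // Returns the value of the named property, or the fallback if it is absent.
    static String property(String xml, String name, String fallback) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                String propName = prop.getElementsByTagName("name").item(0).getTextContent().trim();
                if (propName.equals(name)) {
                    return prop.getElementsByTagName("value").item(0).getTextContent().trim();
                }
            }
            return fallback;
        } catch (Exception e) {
            // Malformed XML or a property without name/value ends up here.
            throw new IllegalArgumentException("Could not read configuration XML", e);
        }
    }

    public static void main(String[] args) {
        // Sample content only; a real check would read the cluster's hdfs-site.xml.
        String sample =
                "<configuration>"
              + "<property>"
              + "<name>dfs.encrypt.data.transfer</name>"
              + "<value>true</value>"
              + "</property>"
              + "</configuration>";
        System.out.println("dfs.encrypt.data.transfer = "
                + property(sample, "dfs.encrypt.data.transfer", "false"));
    }
}
```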

Super Collaborator

So you would have to create a class that extends TrustedChannelResolver and overrides the isTrusted() method, then set that class in the HDFS property dfs.trustedchannel.resolver.class. I have not done this before, but if you are up for it I can assist.
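
A rough, untested sketch of what such a resolver might look like. To keep it compilable without Hadoop on the classpath, ResolverStandIn below is a made-up stand-in for the real base class (org.apache.hadoop.hdfs.protocol.datatransfer.TrustedChannelResolver), reduced to its two isTrusted methods; AllowListResolver, isTrustedHost and the sample addresses are likewise hypothetical. The assumption here is that a "trusted" peer is one whose data transfer is left unencrypted.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.HashSet;
import java.util.Set;

// Stand-in for Hadoop's TrustedChannelResolver so the sketch compiles on its own.
// In a real deployment you would extend the Hadoop class instead.
class ResolverStandIn {
    public boolean isTrusted() { return false; }
    public boolean isTrusted(InetAddress peerAddress) { return false; }
}

// Hypothetical resolver: peers on the allow-list count as trusted (assumed to mean
// "skip encryption"); every other peer is untrusted and stays encrypted.
class AllowListResolver extends ResolverStandIn {
    private final Set<String> trustedHosts;

    AllowListResolver(Set<String> trustedHosts) {
        this.trustedHosts = new HashSet<>(trustedHosts);
    }

    // Plain-string check, split out so it can be exercised without InetAddress.
    boolean isTrustedHost(String hostAddress) {
        return trustedHosts.contains(hostAddress);
    }

    @Override
    public boolean isTrusted(InetAddress peerAddress) {
        return isTrustedHost(peerAddress.getHostAddress());
    }
}

class AllowListResolverDemo {
    public static void main(String[] args) throws UnknownHostException {
        Set<String> allowList = new HashSet<>();
        allowList.add("10.0.0.5"); // example address only
        AllowListResolver resolver = new AllowListResolver(allowList);

        InetAddress inside = InetAddress.getByName("10.0.0.5");
        InetAddress outside = InetAddress.getByName("192.168.1.20");
        System.out.println(inside.getHostAddress() + " trusted: " + resolver.isTrusted(inside));
        System.out.println(outside.getHostAddress() + " trusted: " + resolver.isTrusted(outside));
    }
}
```

Two caveats: as far as I know Hadoop instantiates the resolver from the class name, so a real implementation would read its allow-list from the Hadoop configuration rather than take a constructor argument. Also, the decision is made per peer address, so on its own it would not separate sensitive from non-sensitive flows that write to the same DataNodes.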

Guru

I was hoping to be granular with encryption of sensitive data vs non-sensitive data flowing into HDFS for performance reasons. If performance differences are not that large ... it is no big deal, then.