Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Hash field in Nifi (SHA2_512)

Hi, guys,

I have a data flow in NiFi that gets a file from a server and converts it to Avro to stream the data into Hive. The flow carries some sensitive information that I need to hash (SHA2_512). I see that NiFi has a couple of processors for hashing, but they seem to hash only the whole file. Is there a way to hash a specific field? Before the conversion to Avro, my flow files arrive from the server as fields separated by pipes ('|').

Thanks in advance!

Cheers

1 ACCEPTED SOLUTION

Master Guru

For algorithms that don't require secret information, I think it could be helpful to add such functions to NiFi Expression Language, although they don't exist right now (please feel free to file a Jira to add such a capability). With this capability you could use UpdateRecord to replace a field with its hashed value.

In the meantime, if you are comfortable with a scripting language such as Groovy, Javascript, Jython, JRuby, Clojure, or Lua, you could write a ScriptedLookupService that can "lookup" a specified field by returning the hash of its value. Then you can use LookupRecord with that service to get the hashed values. Alternatively you could implement a ScriptedRecordSetWriter that is configured to hash the values of the specified field(s), and then ConvertRecord with that writer.
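
As a rough illustration of the per-field logic such a script would apply, here is a minimal sketch in plain Python. It is not an actual NiFi script and does not use the NiFi scripting API; the function name `hash_fields`, the sample record, and the chosen field index are all made up for illustration. It assumes pipe-delimited, UTF-8 text as described in the question.

```python
import hashlib


def hash_fields(line, field_indexes, delimiter="|"):
    """Return `line` with the fields at `field_indexes` replaced by
    the hex-encoded SHA-512 digest of their original values."""
    fields = line.split(delimiter)
    for i in field_indexes:
        # Hash the UTF-8 bytes of the field; hexdigest() yields 128 hex characters.
        fields[i] = hashlib.sha512(fields[i].encode("utf-8")).hexdigest()
    return delimiter.join(fields)


# Hypothetical record: id | email | date, with only the email treated as sensitive.
record = "1001|alice@example.com|2017-05-01"
hashed = hash_fields(record, [1])
print(hashed)
```

Only the selected field changes; the other fields and the delimiter layout pass through untouched, so the downstream conversion to Avro could stay the same. Inside NiFi, the same idea would have to be adapted to whatever record or field object the scripted component receives.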

2 REPLIES

Explorer

I explored a way to do this by developing a custom processor that hashes the selected columns and can output the data as CSV or as Avro (the default). You can download the processor from HERE.

The full source code is shared at the link below; feel free to contribute further functionality.

https://github.com/vanducng/hashing-columns-nifi-processor