Created 07-20-2018 09:12 PM
I have to migrate data from Prod to Test, but my Prod data has sensitive information in the JSON file. I want to replace the values of a dozen sensitive fields with random data. How can I accomplish that using NiFi?
For example, the following SSN value should be replaced with 11111, etc.
,"ssn":"5010040000022626666666",
"name":"john" should be replaced with bob, etc.
Your help is appreciated
Thanks
Created 07-21-2018 10:20 AM
Use the UpdateRecord processor and configure its Record Reader and Record Writer controller services.
Add the names of the sensitive fields as dynamic properties, then replace the field.value with your desired value.
For more details on UpdateRecord, refer to this link.
If you are using NiFi < 1.2, use the ReplaceText processor instead: add a regex that matches the sensitive values, then replace them with constant values.
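To make the regex idea concrete, here is a rough sketch in plain Python, outside NiFi. The field names and dummy values in REPLACEMENTS and the helper mask_sensitive are made up for illustration; it assumes the sensitive values are plain quoted strings with no escaped quotes inside them.

```python
import re

# Hypothetical map of sensitive field names to dummy values (illustrative only).
REPLACEMENTS = {"ssn": "11111", "name": "bob"}

def mask_sensitive(json_text: str) -> str:
    """Replace the quoted string value of each listed field with a constant."""
    for field, dummy in REPLACEMENTS.items():
        # Group 1 keeps '"field":"', group 2 keeps the closing quote.
        pattern = r'("' + re.escape(field) + r'"\s*:\s*")[^"]*(")'
        json_text = re.sub(
            pattern,
            lambda m, d=dummy: m.group(1) + d + m.group(2),
            json_text,
        )
    return json_text

sample = '{"id":7,"ssn":"5010040000022626666666","name":"john"}'
masked = mask_sensitive(sample)
print(masked)
```

The same kind of pattern, one per field, is what would go into a text-replacement step; the record-based approach is usually safer because it does not depend on how the JSON happens to be formatted.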
Created 07-23-2018 11:06 PM
Thanks for the information. I tried the first approach you mentioned and it works fine. The problem is that it is applied only to the first line of my JSON file and skips the rest. I am applying this logic to a merged JSON file that has many records, but for some reason it applies only to the first line.
My regular expression is something like this: replaceRegex(/account/id, '(^.*$)', '99999999999').
Please help! Thanks
Created 07-24-2018 12:00 AM
Make sure your input JSON messages match the schema that you have configured.
I have just tried with the input JSON below:
[{"account":{"id":1}},{"account":{"id":2}}]
The UpdateRecord dynamic property is:
Property name: /account/id
Property value: replaceRegex(/account/id, '(^.*$)', '99999999999')
Avro schema
{ "type" : "record", "name" : "sch", "namespace" : "avro", "fields" : [ { "name" : "account", "type" : { "type" : "record", "name" : "account", "fields" : [ { "name" : "id", "type" : "long" } ] } } ] }
Output:
[{"account":{"id":99999999999}},{"account":{"id":99999999999}}]
Worked as expected and replaced both records in the array.
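For anyone who wants to check the regex itself outside NiFi, here is a small Python sketch of what the replacement does to every record in the array. This only mimics the behavior; it is not how UpdateRecord is implemented, and replace_account_id is a made-up helper name.

```python
import json
import re

def replace_account_id(records, replacement="99999999999"):
    """Apply the '(^.*$)' replacement to account.id in every record."""
    out = []
    for rec in records:
        current = str(rec["account"]["id"])
        # '^.*$' matches the whole value, so the value is replaced outright.
        new_value = re.sub(r"(^.*$)", replacement, current)
        # The schema declares id as long, so convert back to an integer.
        out.append({"account": {"id": int(new_value)}})
    return out

records = json.loads('[{"account":{"id":1}},{"account":{"id":2}}]')
result = json.dumps(replace_account_id(records), separators=(",", ":"))
print(result)
```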
Created 07-24-2018 12:57 AM
I was able to reproduce the same issue in NiFi < 1.7; it was fixed in NiFi 1.7. Refer to this Jira link, which addresses the same issue.
In NiFi 1.7, the same input JSON produces an output flowfile like the one below.
[{"account":{"id":99999999999}},{"account":{"id":99999999999}},{"account":{"id":99999999999}},{"account":{"id":99999999999}},{"account":{"id":99999999999}},{"account":{"id":99999999999}}]
In NiFi 1.6 we cannot read multiple JSON arrays from a single flowfile. To work around this issue:
Method 1:
If you have one JSON array on each line, use the SplitText processor to split the file into an individual flowfile for each message, then feed the splits relationship to the UpdateRecord processor.
(or)
Method 2:
You can also use the SplitContent processor with Byte Sequence Format set to Text and }] as the byte sequence (some reference regarding the SplitContent processor), then feed the splits relationship to the UpdateRecord processor.
In addition, once you have split the content, you can feed the splits relationship to the MergeRecord processor to merge these JSON array flowfiles into one JSON message, then use the UpdateRecord processor to work on the merged array of JSON messages.
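As a rough illustration of Method 1, this Python sketch treats each line as its own JSON array (the way each split becomes its own flowfile) and applies the replacement to every split. It is only a model of the flow, not NiFi code; split_lines and mask_ids are hypothetical helper names, and the input is assumed to hold exactly one array per line.

```python
import json

def split_lines(content: str):
    """Yield one parsed JSON array per non-empty line (one 'flowfile' each)."""
    for line in content.splitlines():
        if line.strip():
            yield json.loads(line)

def mask_ids(records, dummy=99999999999):
    """Return copies of the records with account.id set to the dummy value."""
    return [{**rec, "account": {**rec["account"], "id": dummy}} for rec in records]

content = '[{"account":{"id":1}}]\n[{"account":{"id":2}}]\n'
splits = [mask_ids(records) for records in split_lines(content)]
for records in splits:
    print(json.dumps(records, separators=(",", ":")))
```

Because each line is parsed separately, every array gets processed, which is the effect the split step is meant to achieve on versions where only the first array was being read.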
Created 07-24-2018 09:22 PM
Thanks a lot, SplitText is working. One last thing: I am now trying to merge the files back after UpdateRecord is done. All the records in the merge end up on one single line. I want them on separate lines. Please let me know. Thanks
Created 07-25-2018 04:09 PM
One last thing. I just noticed my queries don't like the [] at the end of each line. These are created by UpdateRecord. How can I remove or avoid them? Please let me know. Thanks for all your help
Created 07-25-2018 08:33 PM
@Shu I used a couple of ReplaceText processors to fix it. All good. Thanks
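The exact ReplaceText settings are not given in the thread, so the following is only a guess at the effect, sketched in Python: strip one leading [ and one trailing ] from each line. strip_array_brackets is a made-up helper, and it assumes each line holds an array with a single record; with several records per array the objects would stay comma-separated on one line.

```python
def strip_array_brackets(line: str) -> str:
    """Remove one enclosing [ ... ] pair from a line, if present."""
    line = line.strip()
    if line.startswith("[") and line.endswith("]"):
        return line[1:-1]
    return line

lines = [
    '[{"account":{"id":99999999999}}]',
    '[{"account":{"id":99999999999}}]',
]
cleaned = "\n".join(strip_array_brackets(l) for l in lines)
print(cleaned)
```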
Created on 07-24-2018 10:00 PM - edited 08-17-2019 11:26 PM
That's expected behavior from the MergeRecord processor in NiFi < 1.7: all the records after MergeRecord will be in a single array
[{},{},{}]
In NiFi 1.7+ you can write one line per object, i.e.
[{}]
[{}]
[{}]
If your desired output is one line per object, use the MergeContent processor with Delimiter Strategy set to Text and the Demarcator set to a newline (press Shift+Enter in the property value box to insert it). You will then get an output flowfile with one line per object.
Output:
[{}]
[{}]
[{}]
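In other words, the merge is just the fragments joined by the demarcator. A minimal Python sketch of that idea, with merge_with_demarcator as a made-up name and a real newline character as the demarcator:

```python
def merge_with_demarcator(fragments, demarcator="\n"):
    """Join flowfile contents, placing the demarcator between fragments."""
    return demarcator.join(fragments)

fragments = ['[{"a":1}]', '[{"a":2}]', '[{"a":3}]']
merged = merge_with_demarcator(fragments)
print(merged)
```

The demarcator here is the newline character itself, not the typed words for the key combination.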
Created 07-25-2018 05:02 AM
Thanks @Shu.
I tried the MergeContent processor with Delimiter Strategy as Text and the Demarcator as shift+enter.
The string shift+enter is being added after every record. Here is a sample:
Purchase":"1363.95"}]}}}]shift+enter[{"key":{"A
Am I doing something wrong? I am using all the default options in MergeContent except those mentioned above.