Created 04-20-2018 08:31 PM
Hi
How to search and remove NULL value containing fields in incoming flow files in nifi . I was used replace test processor but it was removing only particular field. Here the Incoming flow file is JSON file format . please provide your thoughts in theis regard
Created on 04-21-2018 04:09 AM - edited 08-17-2019 06:39 PM
One way of doing is to use Replace text processor to search for "NULL" in json record key/value and replace with empty string(blank), by using this way we can replace all NULL key/value pairs from flowfile content.
Method1:-
Input:-
[ { "color": "black", "category": "hue", "type": "NULL", "code": { "rgba": [255,255,255,1], "hex": "#000" } }, { "color": "NULL", "category": "value", "code": { "rgba": [0,0,0,1], "hex": "#FFF" } }, { "color": "red", "category": "hue", "type": "primary", "code": { "rgba": NULL, "hex": "#FF0" } }]
ReplaceText Configs:-
Search Value
("type": "NULL",|"color": "NULL",|"rgba": NULL,) //search for all possible NULL key/value pairs(| is used as OR)
Replacement Value
//replace the with empty
Character Set
UTF-8
Maximum Buffer Size
1 MB //needs to change the size if your flowfile is more than 1 MB
Replacement Strategy
Regex Replace
Output From Replace Text Processor:-
[ { "color": "black", "category": "hue", "code": { "rgba": [255,255,255,1], "hex": "#000" } }, { "category": "value", "code": { "rgba": [0,0,0,1], "hex": "#FFF" } }, { "color": "red", "category": "hue", "type": "primary", "code": { "hex": "#FF0" } }]
All NULL values are replaced from the Flowfile Content.
(OR)
Method2:-
This way we are using ConvertRecord processor with JsonTreeReader as Record Reader and JsonSetWriter as Record Writer with the below property as
Suppress Null Values
Always Suppress //if the record having null value then we are not writing that into Output Flowfile.
Flow:-
GenerateFlowFile:-
This processor is used to generate sample records
Custom text property as
[ { "color": "black", "category": "hue", "type": "NULL", "code": { "rgba": [255,255,255,1], "hex": "#000" } }, { "color": "NULL", "category": "value", "code": { "rgba": [0,0,0,1], "hex": "#FFF" } }, { "color": "red", "category": "hue", "type": "primary", "code": { "rgba": NULL, "hex": "#FF0" } }]
add new property
schema.name
sch
ReplaceText:-
As we are going to use ConvertRecord processor and your input file having "NULL",NULL as values ,but these are not treated as null in Avro Schema Registry. so we are going to preparing json records which are going to understand by ConvertRecord Processor.
---
Search Value
(NULL|"NULL") //search for NULL (or) "NULL" in the flowfile content
Replacement Value
null //replace with null
Maximum Buffer Size
1 MB
Replacement Strategy
Regex Replace
Evaluation Mode
Entire text
Output:-
[ { "color": "black", "category": "hue", "type": null, "code": { "rgba": [255,255,255,1], "hex": "#000" } }, { "color": null, "category": "value", "code": { "rgba": [0,0,0,1], "hex": "#FFF" } }, { "color": "red", "category": "hue", "type": "primary", "code": { "rgba": null, "hex": "#FF0" } }]
All the "NULL",NULL and replaced with null in the flowfile content.
ConvertRecord Processor:-
Record Reader/Writer controller services are configured and enable the controller service and add AvroSchemaRegistry as
we need to configure the avro schema with can match your input json record.
JsontreeReader Configs:-
ArrayAvroSchemaRegistry Configs:-
sch
{ "type" : "record", "name" : "schema", "namespace" : "avroschema", "fields" : [ { "name" : "color", "type" : ["null","string"] }, { "name" : "category", "type" : "string" }, { "name" : "type", "type" : ["null","string"] }, { "name" : "code", "type" : { "type" : "record", "name" : "code", "fields" : [ { "name" : "rgba", "type" : { "type" : "array", "items" : ["null","int"] } }, { "name" : "hex", "type" : ["null","string"] }] } } ] }
JsonSetWriter Configs:-
Change the Supress Null Values to Always Suppress(we are not writing the null values for the json key's in the record)
Output:-
[{"color":"black","category":"hue","code":{"rgba":[255,255,255,1],"hex":"#000"}},{"category":"value","code":{"rgba":[0,0,0,1],"hex":"#FFF"}},{"color":"red","category":"hue","type":"primary","code":{"rgba":[],"hex":"#FF0"}}]
all the null values in the json records are not written in the output flowfile but the only issues is with rgba array even we are having null values for the array still jsonsetwriter writer an empty array for rgba
ReplaceText:-
Now we are replacing the empty rgba empty array in this processor
Search Value
"rgba":[], //search for this value
Replacement Value
//same as method1 replacetext configs
Maximum Buffer Size
1 MB
Replacement Strategy
Literal Replace
Evaluation Mode
Entire text
Output:-
[{"color":"black","category":"hue","code":{"rgba":[255,255,255,1],"hex":"#000"}},{"category":"value","code":{"rgba":[0,0,0,1],"hex":"#FFF"}},{"color":"red","category":"hue","type":"primary","code":{"hex":"#FF0"}}]
I have attached my template below, save/upload the template and change as per your needs.
-
If the Answer Addressed Your Question, Click on Accept button below to accept the answer, That would be great help to Community users to find solution quickly for these kind of issues.
Created 04-20-2018 10:23 PM
Hi
I have following JSON flow file and trying to remove null fields in the file and store into elastic search. here we need to remove Null containing fields from any type integer or character
[ { "color": "black", "category": "hue", "type": "NULL", "code": { "rgba": [255,255,255,1], "hex": "#000" } }, { "color": "NULL", "category": "value", "code": { "rgba": [0,0,0,1], "hex": "#FFF" } }, { "color": "red", "category": "hue", "type": "primary", "code": { "rgba": NULL, "hex": "#FF0" } }]
Flow used ==> Getfile ==>replacetest==>putelasticsearchrecords
Created 04-20-2018 10:25 PM
Hi
I have following JSON flow file and trying to remove null fields in the file and store into elastic search. here we need to remove Null containing fields from any type integer or character
[ { "color": "black", "category": "hue", "type": "NULL", "code": { "rgba": [255,255,255,1], "hex": "#000" } }, { "color": "NULL", "category": "value", "code": { "rgba": [0,0,0,1], "hex": "#FFF" } }, { "color": "red", "category": "hue", "type": "primary", "code": { "rgba": NULL, "hex": "#FF0" } }]
Flow used ==> Getfile ==>replacetest==>putelasticsearchrecords
Desired Output
[ { "color": "black", "category": "hue", "code": { "rgba": [255,255,255,1], "hex": "#000" } }, { "category": "value", "code": { "rgba": [0,0,0,1], "hex": "#FFF" } }, { "color": "red", "category": "hue", "type": "primary", "code": { "hex": "#FF0" } }]
Created on 04-21-2018 04:09 AM - edited 08-17-2019 06:39 PM
One way of doing is to use Replace text processor to search for "NULL" in json record key/value and replace with empty string(blank), by using this way we can replace all NULL key/value pairs from flowfile content.
Method1:-
Input:-
[ { "color": "black", "category": "hue", "type": "NULL", "code": { "rgba": [255,255,255,1], "hex": "#000" } }, { "color": "NULL", "category": "value", "code": { "rgba": [0,0,0,1], "hex": "#FFF" } }, { "color": "red", "category": "hue", "type": "primary", "code": { "rgba": NULL, "hex": "#FF0" } }]
ReplaceText Configs:-
Search Value
("type": "NULL",|"color": "NULL",|"rgba": NULL,) //search for all possible NULL key/value pairs(| is used as OR)
Replacement Value
//replace the with empty
Character Set
UTF-8
Maximum Buffer Size
1 MB //needs to change the size if your flowfile is more than 1 MB
Replacement Strategy
Regex Replace
Output From Replace Text Processor:-
[ { "color": "black", "category": "hue", "code": { "rgba": [255,255,255,1], "hex": "#000" } }, { "category": "value", "code": { "rgba": [0,0,0,1], "hex": "#FFF" } }, { "color": "red", "category": "hue", "type": "primary", "code": { "hex": "#FF0" } }]
All NULL values are replaced from the Flowfile Content.
(OR)
Method2:-
This way we are using ConvertRecord processor with JsonTreeReader as Record Reader and JsonSetWriter as Record Writer with the below property as
Suppress Null Values
Always Suppress //if the record having null value then we are not writing that into Output Flowfile.
Flow:-
GenerateFlowFile:-
This processor is used to generate sample records
Custom text property as
[ { "color": "black", "category": "hue", "type": "NULL", "code": { "rgba": [255,255,255,1], "hex": "#000" } }, { "color": "NULL", "category": "value", "code": { "rgba": [0,0,0,1], "hex": "#FFF" } }, { "color": "red", "category": "hue", "type": "primary", "code": { "rgba": NULL, "hex": "#FF0" } }]
add new property
schema.name
sch
ReplaceText:-
As we are going to use ConvertRecord processor and your input file having "NULL",NULL as values ,but these are not treated as null in Avro Schema Registry. so we are going to preparing json records which are going to understand by ConvertRecord Processor.
---
Search Value
(NULL|"NULL") //search for NULL (or) "NULL" in the flowfile content
Replacement Value
null //replace with null
Maximum Buffer Size
1 MB
Replacement Strategy
Regex Replace
Evaluation Mode
Entire text
Output:-
[ { "color": "black", "category": "hue", "type": null, "code": { "rgba": [255,255,255,1], "hex": "#000" } }, { "color": null, "category": "value", "code": { "rgba": [0,0,0,1], "hex": "#FFF" } }, { "color": "red", "category": "hue", "type": "primary", "code": { "rgba": null, "hex": "#FF0" } }]
All the "NULL",NULL and replaced with null in the flowfile content.
ConvertRecord Processor:-
Record Reader/Writer controller services are configured and enable the controller service and add AvroSchemaRegistry as
we need to configure the avro schema with can match your input json record.
JsontreeReader Configs:-
ArrayAvroSchemaRegistry Configs:-
sch
{ "type" : "record", "name" : "schema", "namespace" : "avroschema", "fields" : [ { "name" : "color", "type" : ["null","string"] }, { "name" : "category", "type" : "string" }, { "name" : "type", "type" : ["null","string"] }, { "name" : "code", "type" : { "type" : "record", "name" : "code", "fields" : [ { "name" : "rgba", "type" : { "type" : "array", "items" : ["null","int"] } }, { "name" : "hex", "type" : ["null","string"] }] } } ] }
JsonSetWriter Configs:-
Change the Supress Null Values to Always Suppress(we are not writing the null values for the json key's in the record)
Output:-
[{"color":"black","category":"hue","code":{"rgba":[255,255,255,1],"hex":"#000"}},{"category":"value","code":{"rgba":[0,0,0,1],"hex":"#FFF"}},{"color":"red","category":"hue","type":"primary","code":{"rgba":[],"hex":"#FF0"}}]
all the null values in the json records are not written in the output flowfile but the only issues is with rgba array even we are having null values for the array still jsonsetwriter writer an empty array for rgba
ReplaceText:-
Now we are replacing the empty rgba empty array in this processor
Search Value
"rgba":[], //search for this value
Replacement Value
//same as method1 replacetext configs
Maximum Buffer Size
1 MB
Replacement Strategy
Literal Replace
Evaluation Mode
Entire text
Output:-
[{"color":"black","category":"hue","code":{"rgba":[255,255,255,1],"hex":"#000"}},{"category":"value","code":{"rgba":[0,0,0,1],"hex":"#FFF"}},{"color":"red","category":"hue","type":"primary","code":{"hex":"#FF0"}}]
I have attached my template below, save/upload the template and change as per your needs.
-
If the Answer Addressed Your Question, Click on Accept button below to accept the answer, That would be great help to Community users to find solution quickly for these kind of issues.
Created 04-23-2018 01:11 PM
@shu , Thanks you very much for your detail explanation and options to solve the problem . Appreciate your effort . Thanks for your time and help
Created 12-11-2018 12:06 AM
I have used jolt transformer to transform json to json.
convertAvroToJSON => JoltTransformerJSON => ....
Here is sample specification
[ { "operation": "default", "spec": { "*": "XXXXX" } }, { "operation": "shift", "spec": { "*": { "XXXXX": null, "*": { "@1": "&2" } } } } ]