Created 12-04-2017 01:08 AM
I have the following JSON array:
[ { "id": "6935895746", "type": "PushEvent", "actor": { "id": 32568916, "login": "bigajwiktoria", "display_login": "bigajwiktoria", "gravatar_id": "", "url": "https://api.github.com/users/bigajwiktoria", "avatar_url": "https://avatars.githubusercontent.com/u/32568916?" } }, { "id": "6935895745", "type": "PushEvent", "actor": { "id": 463230, "login": "taylorotwell", "display_login": "taylorotwell", "gravatar_id": "", "url": "https://api.github.com/users/taylorotwell", "avatar_url": "https://avatars.githubusercontent.com/u/463230?" } } ]
and I would like to convert it to a CSV table with three columns in the following order: id, type, actor.login:
"6935895746" , "PushEvent", "bigajwiktoria"
"6935895745" , "PushEvent", "taylorotwell"
Created 12-04-2017 01:32 AM
Hi @Julià Delos,
There are two ways you can handle this scenario.
Old way:
Read the file, split the JSON into individual messages, then evaluate JSON Path to extract the attributes (id, type, login) with the expressions [$.id, $.type, $.actor.login].
Use the ReplaceText processor to replace the entire content of each message's flowfile with ${id},${type},${login}, then concatenate the flowfiles and write the data.
The same approach is documented in the HCC KB.
New way:
Use the record reader and writer, which should convert your JSON programmatically. More on this can be found on the NiFi blog.
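To make the record-based idea concrete, here is a minimal sketch in plain Python (not NiFi code): it reads the whole array as records and writes one CSV row per record. The function name events_to_csv and the trimmed sample input are assumptions for illustration only.

```python
import csv
import io
import json


def events_to_csv(json_text):
    """Convert a JSON array of events to CSV rows of id, type, actor.login."""
    events = json.loads(json_text)
    buffer = io.StringIO()
    # QUOTE_ALL wraps every field in double quotes, like the desired output.
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for event in events:
        writer.writerow([event["id"], event["type"], event["actor"]["login"]])
    return buffer.getvalue()


# Trimmed version of the array from the question (hypothetical sample).
sample = """[
  {"id": "6935895746", "type": "PushEvent", "actor": {"login": "bigajwiktoria"}},
  {"id": "6935895745", "type": "PushEvent", "actor": {"login": "taylorotwell"}}
]"""

print(events_to_csv(sample), end="")
```

Because all records are handled in one pass, the result is a single CSV document and no separate merge step is needed.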
Hope this helps !!
Created on 12-04-2017 01:42 AM - edited 08-17-2019 08:42 PM
You need to split the JSON array first, using the
SplitJson processor:-
Set the JsonPath Expression property to
$.*
Input:-
[ { "id": "6935895746", "type": "PushEvent", "actor": { "id": 32568916, "login": "bigajwiktoria", "display_login": "bigajwiktoria", "gravatar_id": "", "url": "https://api.github.com/users/bigajwiktoria", "avatar_url": "https://avatars.githubusercontent.com/u/32568916?" } }, { "id": "6935895745", "type": "PushEvent", "actor": { "id": 463230, "login": "taylorotwell", "display_login": "taylorotwell", "gravatar_id": "", "url": "https://api.github.com/users/taylorotwell", "avatar_url": "https://avatars.githubusercontent.com/u/463230?" } } ]
Output:-
The array above contains 2 messages, so the SplitJson processor produces 2 flowfiles:
ff1:-
{ "id": "6935895746", "type": "PushEvent", "actor": { "id": 32568916, "login": "bigajwiktoria", "display_login": "bigajwiktoria", "gravatar_id": "", "url": "https://api.github.com/users/bigajwiktoria", "avatar_url": "https://avatars.githubusercontent.com/u/32568916?" } }
ff2:-
{ "id": "6935895745", "type": "PushEvent", "actor": { "id": 463230, "login": "taylorotwell", "display_login": "taylorotwell", "gravatar_id": "", "url": "https://api.github.com/users/taylorotwell", "avatar_url": "https://avatars.githubusercontent.com/u/463230?" } }
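As a rough illustration of what the split step does to the data, the following plain-Python sketch (not NiFi code) turns one array document into one document per element. The name split_json_array and the shortened input are assumptions for this example.

```python
import json


def split_json_array(json_text):
    """Return one JSON document (as a string) per element of the top-level array."""
    return [json.dumps(element) for element in json.loads(json_text)]


# Shortened version of the input array (hypothetical sample).
array_text = (
    '[{"id": "6935895746", "type": "PushEvent"},'
    ' {"id": "6935895745", "type": "PushEvent"}]'
)

flowfiles = split_json_array(array_text)
print(len(flowfiles))  # one entry per array element
```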
Then use the EvaluateJsonPath processor:-
In this processor we extract values from the JSON message into flowfile attributes.
Change the Destination property to
flowfile-attribute
Add properties:
actor-login: $.actor.login
id: $.id
type: $.type
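The attribute extraction can be pictured with this small plain-Python sketch. It is not a JSONPath implementation: the hypothetical helper evaluate_path only resolves simple dotted paths like the three used above, which is an assumption made to keep the example short.

```python
import json


def evaluate_path(document, path):
    """Resolve a simple dotted path such as '$.actor.login' (no wildcards or filters)."""
    if not path.startswith("$."):
        raise ValueError("only paths of the form $.a.b are supported in this sketch")
    value = document
    for key in path[2:].split("."):
        value = value[key]
    return value


# Property name -> path, mirroring the processor properties listed above.
properties = {"actor-login": "$.actor.login", "id": "$.id", "type": "$.type"}

message = json.loads(
    '{"id": "6935895746", "type": "PushEvent",'
    ' "actor": {"id": 32568916, "login": "bigajwiktoria"}}'
)

# Attributes are stored as strings, so each value is converted with str().
attributes = {name: str(evaluate_path(message, path)) for name, path in properties.items()}
print(attributes)
```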
ReplaceText processor:-
Now we replace the entire JSON message with our extracted attributes in the ReplaceText processor.
Search Value: (?s)(^.*$)
Replacement Value: "${id}","${type}","${actor-login}"
Character Set: UTF-8
Replacement Strategy: Always Replace
Evaluation Mode: Entire text
Now your flowfile will have the required output as its content.
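The replacement step can be sketched in plain Python as follows. This is only an approximation of the idea, not NiFi's implementation: the hypothetical render function stands in for the ${...} attribute substitution, and replace_entire_text applies the same search pattern to the whole content.

```python
import re


def render(template, attributes):
    """Substitute ${name} placeholders with attribute values (stand-in for the expression syntax)."""
    return re.sub(r"\$\{([^}]+)\}", lambda m: attributes[m.group(1)], template)


def replace_entire_text(content, replacement):
    """Replace the whole content; (?s) lets '.' match newlines so multi-line JSON is covered."""
    return re.sub(r"(?s)(^.*$)", lambda _match: replacement, content, count=1)


attributes = {"id": "6935895746", "type": "PushEvent", "actor-login": "bigajwiktoria"}
content = '{ "id": "6935895746",\n  "type": "PushEvent",\n  "actor": { "login": "bigajwiktoria" } }'

line = replace_entire_text(content, render('"${id}","${type}","${actor-login}"', attributes))
print(line)
```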
Flow:-
SplitJson (split relation) //splits the JSON array into individual messages --> EvaluateJsonPath (matched relation) //extracts the required values from each JSON message and adds them as flowfile attributes --> ReplaceText (success relation) //replaces the flowfile content with the required values
If the answer addressed your question, please click the Accept button below. That would be a great help to community users facing similar issues.
Created on 12-04-2017 03:54 PM - edited 08-17-2019 08:42 PM
@Shu and @bkosaraju thanks for your replies
So far I have not managed to get further than the SplitJSON step. I am using the following DataFlow model to test:
I see that the SplitJSON processor emits 2 files for 1 input, but I do not know how to handle each individual file. I tried to inspect the files by saving them, but in the written file I only see the first record, ff1, and never the second. In the PutFile processor I get the following error:
In general I think I am missing some fundamental understanding of how this flow of files works, and I could not find the right documentation. As I understand it, the input JSON array is split into 2 individual files, but I do not know how to aggregate all the individual outputs into a single CSV file.
Maybe you could point out where I should look?
Created on 12-04-2017 04:10 PM - edited 08-17-2019 08:42 PM
The issue is that the 2 output files have the same filename. To resolve this, change the filename in an
UpdateAttribute processor
by adding the property
filename
${UUID()}
This changes the filename of each flowfile to a UUID. Because a UUID is unique, using it as the filename should avoid the name clash in the PutFile processor.
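Assuming the error does come from two outputs sharing one filename, as suggested above, this plain-Python sketch (not NiFi code) shows why a random UUID per file avoids the clash. The function name write_with_unique_names and the temporary directory are assumptions for illustration.

```python
import tempfile
import uuid
from pathlib import Path


def write_with_unique_names(contents, directory):
    """Write each content string to its own file, named with a random UUID."""
    paths = []
    for content in contents:
        path = Path(directory) / str(uuid.uuid4())
        path.write_text(content, encoding="utf-8")
        paths.append(path)
    return paths


rows = [
    '"6935895746","PushEvent","bigajwiktoria"',
    '"6935895745","PushEvent","taylorotwell"',
]

with tempfile.TemporaryDirectory() as tmp:
    written = write_with_unique_names(rows, tmp)
    # Count the files while the temporary directory still exists.
    file_count = len(list(Path(tmp).iterdir()))

print(file_count)
```

With a shared name the second write would target the same path as the first, which matches the symptom of only ever seeing ff1 on disk.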
Created 12-06-2017 07:48 PM
Use SplitJson and then ConvertRecord.
Created 03-28-2020 08:12 AM
Can you please tell me how I can use ConvertRecord here after SplitJson?
Created 03-28-2020 08:15 AM
I want a single output file that contains all the records from the array.