Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

NiFi JSON array to CSV file

avatar
New Contributor

I have the following JSON array:

[
  {
    "id": "6935895746",
    "type": "PushEvent",
    "actor": {
      "id": 32568916,
      "login": "bigajwiktoria",
      "display_login": "bigajwiktoria",
      "gravatar_id": "",
      "url": "https://api.github.com/users/bigajwiktoria",
      "avatar_url": "https://avatars.githubusercontent.com/u/32568916?"
    },
  {
    "id": "6935895745",
    "type": "PushEvent",
    "actor": {
      "id": 463230,
      "login": "taylorotwell",
      "display_login": "taylorotwell",
      "gravatar_id": "",
      "url": "https://api.github.com/users/taylorotwell",
      "avatar_url": "https://avatars.githubusercontent.com/u/463230?"
   }
]

and I would like to convert it to a CSV table with three columns in the following order: id, type, actor.login:

"6935895746" , "PushEvent", "bigajwiktoria"
"6935895745" , "PushEvent", "taylorotwell"
7 REPLIES 7

avatar
Super Collaborator

Hi @Julià Delos,

there are two ways you can handle the scenario,

old way:

You can read the files, after that split json per message basis, evoulte Json Path to extract the attributes (id,type,login ) with values [$.id, $.type, $.actor.ligin ]

use replace text processor to replace the entire content of each message flow file with ${id},${type},${login} and concatenate the folwfiles and write the data.

the same has been documented in HCC KB

New way:

Use the record reader and writer, which should grammatically convert your json, more on this can be found at NiFi blog

Hope this helps !!

avatar
Master Guru
@Julià Delos

You need to split the json array first by using

SplitJSON processor:-

JsonPath Expression property as

$.*

Configs:-

43795-split.png
Input:-

[
  {
    "id": "6935895746",
    "type": "PushEvent",
    "actor": {
      "id": 32568916,
      "login": "bigajwiktoria",
      "display_login": "bigajwiktoria",
      "gravatar_id": "",
      "url": "https://api.github.com/users/bigajwiktoria",
      "avatar_url": "https://avatars.githubusercontent.com/u/32568916?"
    }
  },
  {
    "id": "6935895745",
    "type": "PushEvent",
    "actor": {
      "id": 463230,
      "login": "taylorotwell",
      "display_login": "taylorotwell",
      "gravatar_id": "",
      "url": "https://api.github.com/users/taylorotwell",
      "avatar_url": "https://avatars.githubusercontent.com/u/463230?"
    }
  }
]

Output:-

As in the above array we are having 2 messages so split json processor results 2 flowfiles

ff1:-

  {
    "id": "6935895746",
    "type": "PushEvent",
    "actor": {
      "id": 32568916,
      "login": "bigajwiktoria",
      "display_login": "bigajwiktoria",
      "gravatar_id": "",
      "url": "https://api.github.com/users/bigajwiktoria",
      "avatar_url": "https://avatars.githubusercontent.com/u/32568916?"
    }
  }

ff2:-

{
    "id": "6935895745",
    "type": "PushEvent",
    "actor": {
      "id": 463230,
      "login": "taylorotwell",
      "display_login": "taylorotwell",
      "gravatar_id": "",
      "url": "https://api.github.com/users/taylorotwell",
      "avatar_url": "https://avatars.githubusercontent.com/u/463230?"
    }
  }

Then use Evaluate Json Path processor:-

in this processor we are extracting the json message values to attributes.

Change property

Destination to

flowfile-attribute

Add properties

actor-login

$.actor.login

id

$.id

type

$.type

Configs:-

43798-eval-json.png

Replace Text processor:-

Now we are replacing the entire json message with our extracted attributes in Replace Text processor.

Search Value

(?s)(^.*$)

Replacement Value

"${id}","${type}","${actor-login}"

Character Set

UTF-8

Replacement Strategy

Always Replace

Evaluation Mode

Entire text

Configs:-

43799-replace.png

now your flowfile will have your required output as content.

Flow:-

SplitJson(split relation) //splitting Json Array to individual messages-->EvaluateJSONPath(Match Relation) //extracting the required values from json message and adding them as flowfile attributes --> Replace Text (Success) //Replacing flowfile contents with the required values


If the answer addressed your question, Then Click on Accept button below, that would be great help to community users if they are facing similar kind of issues.


split.png

avatar
New Contributor

@Shu and @bkosaraju thanks for your replies

So far I do not manage to go farther than the SplitJSON. I am using the following DataFlow model to test:

43813-splitjson.png

I see that from the SplitJSON processor comes out 2 files for 1 as input, however I do not know how to handle each individual file. I try to visualize this file by saving them however in the written file I only see the first record ff1, but I never manage to see the second. In the PutFile processor I get the following error:

43815-error.png

I general I think that I miss some fundamental understanding about how this file flow works, but I could not manage to find the right documentation. As I understand the input JSON array is split in 2 individual files, but I do not know how to aggregate all the individual outputs to an individual CSV files.

Maybe you could point me out where I should look to?

avatar
Master Guru

@Julià Delos,

The issue is as we are having same file name for the 2 output files to resolve this issue you need to change the file name in

Update Attribute processor

By adding property as

filename

${UUID()}

Configs:-

43816-update.png

So we are changing the filename of the flowfile to UUID as uuid is unique, by using uuid as your filename you wont get any issues in PUTFile processor.

avatar
Master Guru

Split Json and then use ConverRecord

avatar
New Contributor

can you please tell me how I can use cconvertRecord here after splitjson..

avatar
New Contributor

I want single file in the output which having all the records from array