How to use Apache NiFi EvaluateJsonPath for JSON to CSV/Text extract

avatar
Rising Star

Hi all,

I am getting my arse kicked by the EvaluateJsonPath.

The task is to extract some JSON attribute values into CSV or plain-text format, to be inserted into a file, a database, etc.

Here is what I have so far:

GetFile (reads a JSON file) --> SplitJson --> EvaluateJsonPath --> PutFile (I will add a merge step in between once I get EvaluateJsonPath working and understand how it works).

The JSON used is below (this is just dummy data):

{
  "results": [
    {
      "gender": "female",
      "name": {
        "title": "ms",
        "first": "juliette",
        "last": "schmitt"
      },
      "location": {
        "street": "7260 esplanade du 9 novembre 1989",
        "city": "le havre",
        "state": "lot",
        "postcode": 58491
      },
      "email": "juliette.schmitt@example.com",
      "login": {
        "username": "smallcat938",
        "password": "horizon",
        "salt": "fvaSIrep",
        "md5": "3d1a182db3002dc88cd54b258838aa89",
        "sha1": "71600ad7a6f65ec00fed7521bcf3d943a87a8172",
        "sha256": "cd9bfc086b5b83a935c99b496c27fc58716521a43c2630e286f87147e4b4dd8a"
      },
      "dob": "1959-08-16 23:11:26",
      "registered": "2010-07-16 20:27:45",
      "phone": "04-09-99-35-16",
      "cell": "06-87-17-31-45",
      "id": {
        "name": "INSEE",
        "value": "259755098491 56"
      },
      "picture": {
        "large": "https://randomuser.me/api/portraits/women/71.jpg",
        "medium": "https://randomuser.me/api/portraits/med/women/71.jpg",
        "thumbnail": "https://randomuser.me/api/portraits/thumb/women/71.jpg"
      },
      "nat": "FR"
    }
  ],
  "info": {
    "seed": "ca021f036c48503a",
    "results": 1,
    "page": 1,
    "version": "1.1"
  }
}

1- GetFile setup:

[Screenshot: GetFile configuration, 23488-getfile.png]

2- SplitJson setup:

[Screenshot: SplitJson configuration, 23489-splitjson.png]

3- EvaluateJsonPath

[Screenshot: EvaluateJsonPath configuration, 23490-evaluatejsonpath.png]

Note: this one is working!

Adding a second property to EvaluateJsonPath is where I get stuck:

[Screenshot: ErrorJsonPaths]

The documentation is not very illustrative on how this should be used, and I am not a JSON expert.

I am trying to understand how the Twitter template does it, since it declares multiple paths:

[Screenshot: twitt.png]

Thanks, all.

1 ACCEPTED SOLUTION

avatar

If you set the Destination to flowfile-content, you can have only one JSON Path expression. You could set the Destination to flowfile-attribute instead, then each JSON Path will be extracted to the named attribute value. If you need the results in the content of the Flow File, use a ReplaceText processor afterwards to collect the attribute values into the content.
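To make the idea concrete, here is a minimal Python sketch of what that flow computes. It is not NiFi code: `evaluate_path` is a hypothetical helper that only handles plain dotted paths, the attribute names (`first_name`, `last_name`, `city`, `email`) are made up for illustration, and the final join stands in for a ReplaceText replacement value along the lines of `${first_name},${last_name},${city},${email}`.

```python
import json

# One split record, shaped like an element of "results" in the question (abridged).
record = json.loads("""
{
  "gender": "female",
  "name": {"title": "ms", "first": "juliette", "last": "schmitt"},
  "location": {"city": "le havre", "postcode": 58491},
  "email": "juliette.schmitt@example.com"
}
""")


def evaluate_path(doc, path):
    """Resolve a simple dotted path such as '$.name.first' (no wildcards or filters)."""
    if not path.startswith("$."):
        raise ValueError("only paths of the form $.a.b are supported in this sketch")
    node = doc
    for key in path[2:].split("."):
        node = node[key]
    return node


# Hypothetical attribute names, each mapped to one JSON path. With the
# destination set to flowfile-attribute, every property becomes one attribute.
paths = {
    "first_name": "$.name.first",
    "last_name": "$.name.last",
    "city": "$.location.city",
    "email": "$.email",
}
attributes = {name: str(evaluate_path(record, path)) for name, path in paths.items()}

# Stand-in for the ReplaceText step: collect the attribute values into the content.
csv_line = ",".join(attributes[name] for name in paths)
print(csv_line)  # juliette,schmitt,le havre,juliette.schmitt@example.com
```

In NiFi itself, each entry of `paths` would be a separate user-defined property on EvaluateJsonPath (property name = attribute name, property value = JSON path).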

7 REPLIES 7

avatar

Hi @Hellmar Becker, thanks for the info. Could you please give an example of what the JSON path should look like?

avatar
Rising Star

OK,

I need to give you a big hug!!!

I was going out of my mind with this! I could not imagine why it was not working!

Thank you so much! You made my day 🙂

avatar
New Contributor

@Adrian Oprea
Hi Adrian, I am trying your example with the JSON data you provided, but I am getting the following error in the SplitJson processor:
2018-06-12 13:44:33,320 ERROR [Timer-Driven Process Thread-3] o.a.nifi.processors.standard.SplitJson SplitJson[id=f57cffea-0163-1000-2a56-c76e963c1ea2] FlowFile StandardFlowFileRecord[uuid=70f23330-cb67-472f-9ccc-2d9f1ecb3d5f,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1528832661718-625, container=default, section=625], offset=0, length=18],offset=0,name=40718.toc,size=18] did not have valid JSON content.
Any pointers please?

avatar
Rising Star

Can you post your JSON file content?

Test your JSON path definition at http://jsonpath.com/. It is easy to use and helps debug any issues with the JSON format.
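A quick local check can also rule out malformed input before it reaches SplitJson. The log above reports a flow file named 40718.toc with size 18, which may mean the processor received something other than the intended JSON file. The sketch below uses only Python's standard `json` module; `check_json` is a hypothetical helper name, not part of NiFi.

```python
import json


def check_json(text):
    """Return (True, None) if text parses as JSON, else (False, a short error message)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return False, f"line {exc.lineno}, column {exc.colno}: {exc.msg}"
    return True, None


# A well-formed document passes; arbitrary non-JSON text does not.
print(check_json('{"results": [], "info": {"page": 1}}'))
print(check_json("not json at all"))
```

Running the same check on the actual file picked up by GetFile shows whether the problem is in the file content or in the flow configuration.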

avatar
New Contributor

Thanks for your reply. Please find the JSON file attached: json.jpeg
In the SplitJson processor, the JsonPath Expression is "$.results", so I can grab all 25 features/columns inside it. But where do I provide "$.info" so I can also grab the seed, results, page, and version columns?

Thanks for your time.

avatar

Hi @Vikas Singh, the SplitJson processor is used to split a JSON array, and EvaluateJsonPath is used to extract JSON fields into attributes or content. In your case:

Step 1: Use an EvaluateJsonPath processor to extract the info fields of the JSON. For example, to extract the info fields, use $.info.seed, $.info.page, $.info.version, and $.info.results, and save them as flow file attributes.
Step 2: Use a SplitJson processor with the expression $.results.
Step 3: Use another EvaluateJsonPath processor to extract the fields of each array element.
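The three steps can be sketched in plain Python as follows. This is only an illustration of the data flow, not NiFi code: the attribute names (`info.seed`, `first_name`, and so on) are hypothetical, the sample document is abridged from the one in the question, and the sketch assumes that each split keeps the attributes set on the original flow file, which is what this approach relies on.

```python
import json

document = json.loads("""
{
  "results": [
    {"name": {"first": "juliette", "last": "schmitt"}, "nat": "FR"}
  ],
  "info": {"seed": "ca021f036c48503a", "results": 1, "page": 1, "version": "1.1"}
}
""")

# Step 1: extract the info fields into attributes
# ($.info.seed, $.info.page, $.info.version, $.info.results).
info_attributes = {
    "info.seed": document["info"]["seed"],
    "info.page": document["info"]["page"],
    "info.version": document["info"]["version"],
    "info.results": document["info"]["results"],
}

# Step 2: split on $.results. Each split carries a copy of the attributes
# extracted in step 1 (assumption stated in the paragraph above).
splits = [(dict(info_attributes), element) for element in document["results"]]

# Step 3: extract per-record fields from each split ($.name.first, $.name.last, $.nat).
rows = []
for attributes, element in splits:
    attributes["first_name"] = element["name"]["first"]
    attributes["last_name"] = element["name"]["last"]
    attributes["nat"] = element["nat"]
    rows.append(attributes)

for row in rows:
    print(row)
```

Each resulting row holds both the record-level values and the info values, so a later step can write them out together.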