How to use Apache NiFi EvaluateJsonPath for JSON to CSV/Text extract

Contributor

Hi all,

I am getting my arse kicked by the EvaluateJsonPath.

So the task is to extract some JSON attribute values into a CSV or text format that will be used for inserting into a file, a DB, etc.

So here is what I have so far:

GetFile (reads a JSON file) --> SplitJson --> EvaluateJsonPath --> PutFile (I will add a merge in between once I get EvaluateJsonPath to work and understand how it works).

The JSON used is (this is just dummy data):

{
  "results": [
    {
      "gender": "female",
      "name": {
        "title": "ms",
        "first": "juliette",
        "last": "schmitt"
      },
      "location": {
        "street": "7260 esplanade du 9 novembre 1989",
        "city": "le havre",
        "state": "lot",
        "postcode": 58491
      },
      "email": "juliette.schmitt@example.com",
      "login": {
        "username": "smallcat938",
        "password": "horizon",
        "salt": "fvaSIrep",
        "md5": "3d1a182db3002dc88cd54b258838aa89",
        "sha1": "71600ad7a6f65ec00fed7521bcf3d943a87a8172",
        "sha256": "cd9bfc086b5b83a935c99b496c27fc58716521a43c2630e286f87147e4b4dd8a"
      },
      "dob": "1959-08-16 23:11:26",
      "registered": "2010-07-16 20:27:45",
      "phone": "04-09-99-35-16",
      "cell": "06-87-17-31-45",
      "id": {
        "name": "INSEE",
        "value": "259755098491 56"
      },
      "picture": {
        "large": "https://randomuser.me/api/portraits/women/71.jpg",
        "medium": "https://randomuser.me/api/portraits/med/women/71.jpg",
        "thumbnail": "https://randomuser.me/api/portraits/thumb/women/71.jpg"
      },
      "nat": "FR"
    }
  ],
  "info": {
    "seed": "ca021f036c48503a",
    "results": 1,
    "page": 1,
    "version": "1.1"
  }
}

1- GetFile setup:

[Screenshot: 23488-getfile.png]

2- SplitJson setup:

[Screenshot: 23489-splitjson.png]

3- EvaluateJsonPath setup:

[Screenshot: 23490-evaluatejsonpath.png]

Note: this one is working!

Now, adding a new property to EvaluateJsonPath is where I get stuck:

[Screenshot: ErrorJsonPaths]

The documentation is not very illustrative on how this should be used, and I am not a JSON expert.

I am trying to understand how the Twitter template actually does it, where multiple paths are declared:

[Screenshot: twitt.png]

thanks all

1 ACCEPTED SOLUTION

If you set the Destination to flowfile-content, you can have only one JSON Path expression. You could set the Destination to flowfile-attribute instead, then each JSON Path will be extracted to the named attribute value. If you need the results in the content of the Flow File, use a ReplaceText processor afterwards to collect the attribute values into the content.
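To make the attribute approach concrete, here is a minimal Python sketch that only imitates what the two processors would do; it is not NiFi code. It assumes EvaluateJsonPath has Destination set to flowfile-attribute with one dynamic property per field, followed by a ReplaceText whose Replacement Value is something like ${first_name},${last_name},${city},${email}. The attribute names and the `extract` helper are hypothetical, and the helper only understands plain dotted paths.

```python
import json


def extract(record, path):
    """Resolve a simple dotted JsonPath such as '$.name.first'.

    Hypothetical helper: no wildcards, filters, or array indexes.
    """
    node = record
    for key in path.lstrip("$").strip(".").split("."):
        node = node[key]
    return node


# One element of the "results" array, as SplitJson would emit it (trimmed).
record = json.loads("""
{"gender": "female",
 "name": {"title": "ms", "first": "juliette", "last": "schmitt"},
 "location": {"city": "le havre", "postcode": 58491},
 "email": "juliette.schmitt@example.com"}
""")

# Dynamic properties on EvaluateJsonPath: attribute name -> JsonPath.
# The attribute names are made up for this illustration.
json_paths = {
    "first_name": "$.name.first",
    "last_name": "$.name.last",
    "city": "$.location.city",
    "email": "$.email",
}

# Each path is evaluated into its own named attribute.
attributes = {name: str(extract(record, path)) for name, path in json_paths.items()}

# Stand-in for ReplaceText collecting the attributes into the content.
csv_line = ",".join(attributes[name] for name in json_paths)
print(csv_line)
```

Each path lands in its own attribute, so adding another column is just another property on the processor and another placeholder in the replacement value.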

7 REPLIES 7

Hi @Hellmar Becker, thanks for the info. But can you please give an example of what the JsonPath should look like?

Contributor

OK,

[Image: 23497-406989-big-hugs-elmo.jpg]

I need to give you a big hug!!!

I was going out of my mind with this!!! I could not imagine why it was not working!

Thank you so much! You made my day 🙂

New Contributor

@Adrian Oprea
Hi Adrian, I am trying your example with the JSON data you've provided but I am getting an error in SplitJSON processor as below:
2018-06-12 13:44:33,320 ERROR [Timer-Driven Process Thread-3] o.a.nifi.processors.standard.SplitJson SplitJson[id=f57cffea-0163-1000-2a56-c76e963c1ea2] FlowFile StandardFlowFileRecord[uuid=70f23330-cb67-472f-9ccc-2d9f1ecb3d5f,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1528832661718-625, container=default, section=625], offset=0, length=18],offset=0,name=40718.toc,size=18] did not have valid JSON content.
Any pointers please?

Contributor

Can you post your JSON file content?

You can test your JsonPath definition at http://jsonpath.com/? which makes it easy to debug any issues with the JSON format.
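The log above reports an 18-byte flow file named 40718.toc, so one possibility (an assumption, not something the thread confirms) is that GetFile picked up a file that is not JSON at all. A quick local check with Python's standard `json` module can show whether a given file's content parses; the `check_json` helper name is hypothetical.

```python
import json


def check_json(text):
    """Return (True, parsed value) if text is valid JSON, else (False, error message)."""
    try:
        return True, json.loads(text)
    except json.JSONDecodeError as err:
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"


# A trimmed version of the dummy document from the question parses fine.
ok, parsed = check_json('{"results": [], "info": {"seed": "ca021f036c48503a"}}')

# Arbitrary non-JSON text is rejected, with the position of the first problem.
bad, reason = check_json("not json at all!!")

print(ok, bad, reason)
```

If the content fails here, SplitJson cannot be expected to parse it either, so the fix belongs upstream, for example in which files the flow picks up.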

New Contributor

Thanks for your reply. Please find attached the JSON file: json.jpeg
In the SplitJson processor, the JsonPath Expression is "$.results" so I can grab all the 25 features/columns inside of it. But where to provide the "$.info" so I can grab the seed, results, page, and version columns?

Thanks for your time.

Hi @Vikas Singh, the SplitJson processor is used to split a JSON array, and EvaluateJsonPath is used to extract JSON fields as attributes or content. In your case:

Step 1: Use an EvaluateJsonPath processor to extract the info fields of the JSON. For example, to extract the info fields, use $.info.seed, $.info.page, $.info.version and $.info.results, and save them as flow file attributes.
Step 2: Use a SplitJson processor with $.results.
Step 3: Use an EvaluateJsonPath processor to extract the JSON array fields.
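The three steps can be imitated in plain Python to see what each stage would produce. This is only a sketch of the data flow, not NiFi code: flow files are modelled as dictionaries, the attribute names are hypothetical, and it assumes that the split flow files keep the attributes set on the parent before the split.

```python
import json

# Trimmed version of the dummy document from the question.
document = json.loads("""
{"results": [
   {"gender": "female",
    "name": {"title": "ms", "first": "juliette", "last": "schmitt"},
    "nat": "FR"}
 ],
 "info": {"seed": "ca021f036c48503a", "results": 1, "page": 1, "version": "1.1"}}
""")

# Step 1: EvaluateJsonPath on the whole document, Destination = flowfile-attribute.
# Paths: $.info.seed, $.info.results, $.info.page, $.info.version
info_attrs = {f"info.{key}": str(value) for key, value in document["info"].items()}

# Step 2: SplitJson with $.results gives one flow file per array element.
# Assumption: each split carries a copy of the parent's attributes.
flowfiles = [
    {"attributes": dict(info_attrs), "content": element}
    for element in document["results"]
]

# Step 3: EvaluateJsonPath on each split, e.g. $.name.first, $.name.last, $.gender
for flowfile in flowfiles:
    record = flowfile["content"]
    flowfile["attributes"]["first"] = record["name"]["first"]
    flowfile["attributes"]["last"] = record["name"]["last"]
    flowfile["attributes"]["gender"] = record["gender"]

for flowfile in flowfiles:
    print(flowfile["attributes"])
```

Extracting the info fields before the split matters because, after SplitJson with $.results, each flow file's content holds only one array element and the info object is no longer in it.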
