Support Questions

Find answers, ask questions, and share your expertise

NIFI Extract text from JSON (a bit complex JSON returned from an API)

avatar
Explorer

I'm using the NIFI ExtractText Processor and I'm trying to come up with the regular expression to extract values from a JSON String that is in the flowfile-content coming from a response of an API (of course I cannot change how the API responds).

 

JSON Data returned from the REST API is: 

{
  "Result" : {
    "SessionID" : null,
    "ResultData" : {
      "DATAENTITYNAME" : "List",
      "CURRENTCURSORPOSITION" : 2,
      "DATARECORD" : [ {
        "DATAFIELD" : [ {
          "FIELDNAME" : "name",
          "FIELDLABEL" : "Name",
          "DATATYPE" : "VARCHAR",
          "FIELDVALUE" : "JSMITH",
          "ISKEYFIELD" : true
        }, {
          "FIELDNAME" : "namedesc",
          "FIELDLABEL" : "Name Desc.",
          "DATATYPE" : "MIXVARCHAR",
          "FIELDVALUE" : "John Smith",
          "ISKEYFIELD" : false
        }, {
          "FIELDNAME" : "hireddate",
          "FIELDLABEL" : "Hired Date",
          "DATATYPE" : "DATE",
          "FIELDVALUE" : "01-JAN-2010",
          "ISKEYFIELD" : false
        }]      
}, {
        "DATAFIELD" : [ {
          "FIELDNAME" : "name",
          "FIELDLABEL" : "Name",
          "DATATYPE" : "VARCHAR",
          "FIELDVALUE" : "RSTONE",
          "ISKEYFIELD" : true
        }, {
          "FIELDNAME" : "namedesc",
          "FIELDLABEL" : "Name Desc.",
          "DATATYPE" : "MIXVARCHAR",
          "FIELDVALUE" : "Robert Stone",
          "ISKEYFIELD" : false
        }, {
          "FIELDNAME" : "hireddate",
          "FIELDLABEL" : "Hired Date",
          "DATATYPE" : "DATE",
          "FIELDVALUE" : "01-JAN-2011",
          "ISKEYFIELD" : false
        }]
      },     
},
    "InfoAlert" : null,
    "WarningAlert" : null
  },
  "ConfirmationAlert" : null,
  "ErrorAlert" : [ ]
}
 
How do I get all "name" and "namedesc"?  (in the example: JSMITH -> John Smith and RSTONE -> Robert Stone)
 
Thanks!!!!!
 
2 ACCEPTED SOLUTIONS

avatar
Super Guru

Hi @Paulito ,

The easiest way I can think of is to do this in two Processors:

 

1- JoltTransformJson: Allows you to transform your json by simplifying it into an array or records where each record has list of fieldname:fieldvalue. To achieve this you need to provide the following jolt spec in the "Jolt Specification" property of the processor:

[
  {
    "operation": "shift",
    "spec": {
      "Result": {
        "ResultData": {
          "DATARECORD": {
            "*": {
              "DATAFIELD": {
                "*": {
                  "FIELDVALUE": "[&3].@(1,FIELDNAME)"
                }
              }
            }
          }
        }
      }
    }
  }
]

 

Basically the spec above will give you the following json based on the provided input:

 

[ {
  "name" : "JSMITH",
  "namedesc" : "John Smith",
  "hireddate" : "01-JAN-2010"
}, {
  "name" : "RSTONE",
  "namedesc" : "Robert Stone",
  "hireddate" : "01-JAN-2011"
} ]

 

2- QueryRecord Processor: to allow you to select the fields you are interested in for the given API as follows. The query is just like sql query and you can either specify wildcard (*) for all fields or just list particular fields as follows:

SAMSAL_0-1689721462315.png

 

The out put of the QueryRecord will look like this:

[
	{
		"name": "JSMITH",
		"namedesc": "John Smith"
	},
	{
		"name": "RSTONE",
		"namedesc": "Robert Stone"
	}
]

 

 

Of course you can make this dynamic for each API by providing both the Jolt spec and the Query as flowfile attributes since both allow expression language (EL) in the value field.

 

You can also achieve the same result by using just the Jolt Transformation processor but the spec would be more complex and I dont want to overwhelm you in case you are new to it, but if you are interested in that let me know and I will be happy to provide you the spec.

 

If this helps please accept solution.

Thanks

 

 

 

 

 

 

 

View solution in original post

avatar
Super Guru

Hi @Paulito,

I noticed you are using JsonPathReader vs JsonTreeReader which what I used. I'm not if the Path Reader is the right choice and if so how to configure correctly. I tried it in my test environment against your input with different JSONPath config and it did not work for me. Can you use JsonTreeReader instead?

View solution in original post

12 REPLIES 12

avatar
Explorer

@SAMSAL , here the information requested

Paulito_0-1689832452971.png

 

Paulito_1-1689832479351.png

 

Paulito_2-1689832501506.png

 

Thanks again for your help and time.

Cheers

avatar
Super Guru

Hi @Paulito,

I noticed you are using JsonPathReader vs JsonTreeReader which what I used. I'm not if the Path Reader is the right choice and if so how to configure correctly. I tried it in my test environment against your input with different JSONPath config and it did not work for me. Can you use JsonTreeReader instead?

avatar
Explorer

@SAMSAL my bad !!! I start to be old i need better glasses 🙂

You are right it works fantastically !!!

THANKS!!!!!