Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

How can i parse string nested json with Nifi?

avatar
Explorer

Hello,

I have a JSON data in the ClickHouse database with a type of string. The JSON data looks like this:

 

[{ "Name": "Gta V" "Type": "Xyz" "content": "{"Game":{"Subject":[{"Time":{"@value":"201511021057"}}]}}" }]

 

The output I get using Nifida ExecuteSqlRecord is like this. I can remove the \ character using the ReplaceText process. The output looks like this: 

 

[{ "Name": "Gta V" "Type": "Xyz" "content": "{"Game":{"Subject":[{"Time":{"@value":"201511021057"}}]}}" }]

 

However, the value part for the content key is still a string. How can I make its type as JSON?

The desired output is like this:

 

[{ "Name": "Gta V" "Type": "Xyz" "content": {"Game":{"Subject":[{"Time":{"@value":"201511021057"}}]}} }]

 



7 REPLIES 7

avatar
Expert Contributor

HI @mstfo,

Take the output to UpdateRecord processor, and cast the desire field to any specific data type example

formula: ${field.value:ToInteger()}

Shakib M.

avatar
Explorer

I am currently using the 'executesqlrecord -> replacetext' (for unescapeJson) process. How should I use the 'updateRecord' processor here? Could you explain a bit more ?

avatar
Expert Contributor

Are you fetching and processing the records as it is ? or defining the json path (flatting the JSON) ?

 

[{ "Name": "Gta V" "Type": "Xyz" "content": "{"Game":{"Subject":[{"Time":{"@value":"201511021057"}}]}}" }]

 

Shakib M.

avatar
Explorer

 

The content field in the source table is in string format. When I receive the data from the source, it comes like this:

[{ "Name": "Gta V" "Type": "Xyz" "content": "{"Game":{"Subject":[{"Time":{"@value":"201511021057" }}]}}" }]

 

 

 

After I fetch this data with NiFi, I need to use a Jolt transformation on the content field, but since the content field is in string format, I can't perform the transformation. My actual problem is to eliminate the string expression in the content field and convert it to the following format so that I can use Jolt for the content field:
 
 

 

[{ "Name": "Gta V" "Type": "Xyz" "content": {"Game":{"Subject":[{"Time":{"@value":"201511021057"}}]}} }]

 

avatar
Expert Contributor

Can you send the JOLT just to understand what you are exactly extracting?

for example if you are extracting @value  then something like "@value": "=toInteger" in JOLT can be written for typeconversion.

I belive type conversion can not be done in list.

Shakib M.

avatar
Explorer

"This is the output I obtained with executesqlrecord: [{ "Name": "Gta V" "Type": "Xyz" "content": "{"Game":{"Subject":[{"Time":{"@value":"201511021057" }}]}}" }] After executesqlrecord, I use ${'$1':unescapeJson()} in the replacetext processor to remove the backslash characters. Then my output looks like this: [{ "Name": "Gta V" "Type": "Xyz" "content": "{"Game":{"Subject":[{"Time":{"@value":"201511021057" }}]}}" }]

What I want is, after removing the backslashes, the nested JSON data in the content field should not come within double quotes. I want to learn the ways to remove it after ReplaceText. So, ultimately, the desired output is:

[{ "Name": "Gta V" "Type": "Xyz" "content": {"Game":{"Subject":[{"Time":{"@value":"201511021057"}}]}} }]

avatar
Explorer

Hi @mstfo ,

Firstly it seems that the output you have after the replacetext processor isn't really a json, because of the missing commas (' , ') between the fields.

If your output is exactly like this, it will be necessary to add those missing commas so then you can get a json like this:

[
  {
    "Name": "Gta V",
    "Type": "Xyz",
    "content": "{\"Game\":{\"Subject\":[{\"Time\":{\"@value\":\"201511021057\" }}]}}"
  }
]

To obtain a json like this you can either:

  • try to change the logic when you get the data from the database;
  • try to add the commas manually.

The first option is the quicker solution, but there is the possibility that maybe you can't change the input you receive from the db, so perhaps you have to manually change it yourself.

 

Anyway, in case you manage to obtain a correct json from your input, then you can manipulate the string using the functions in the beta operations in jolt (like the =split()) and then reassemble the wanted json.

I wrote a guide and in the very last example i did, there is a similar case, so if you want go there and give it a look -> JOLT guide .