- Subscribe to RSS Feed
- Mark Question as New
- Mark Question as Read
- Float this Question for Current User
- Bookmark
- Subscribe
- Mute
- Printer Friendly Page
How can i parse string nested json with Nifi?
- Labels:
-
Apache NiFi
Created ‎04-02-2024 01:25 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hello,
I have a JSON data in the ClickHouse database with a type of string. The JSON data looks like this:
[{ "Name": "Gta V" "Type": "Xyz" "content": "{"Game":{"Subject":[{"Time":{"@value":"201511021057"}}]}}" }]
The output I get using Nifida ExecuteSqlRecord is like this. I can remove the \ character using the ReplaceText process. The output looks like this:
[{ "Name": "Gta V" "Type": "Xyz" "content": "{"Game":{"Subject":[{"Time":{"@value":"201511021057"}}]}}" }]
However, the value part for the content key is still a string. How can I make its type as JSON?
The desired output is like this:
[{ "Name": "Gta V" "Type": "Xyz" "content": {"Game":{"Subject":[{"Time":{"@value":"201511021057"}}]}} }]
Created ‎04-02-2024 07:48 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
HI @mstfo,
Take the output to UpdateRecord processor, and cast the desire field to any specific data type example
formula: ${field.value:ToInteger()}
Created ‎04-02-2024 11:50 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
I am currently using the 'executesqlrecord -> replacetext' (for unescapeJson) process. How should I use the 'updateRecord' processor here? Could you explain a bit more ?
Created ‎04-03-2024 12:56 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Are you fetching and processing the records as it is ? or defining the json path (flatting the JSON) ?
[{ "Name": "Gta V" "Type": "Xyz" "content": "{"Game":{"Subject":[{"Time":{"@value":"201511021057"}}]}}" }]
Created ‎04-03-2024 01:45 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
The content field in the source table is in string format. When I receive the data from the source, it comes like this:
[{ "Name": "Gta V" "Type": "Xyz" "content": "{"Game":{"Subject":[{"Time":{"@value":"201511021057" }}]}}" }]
[{ "Name": "Gta V" "Type": "Xyz" "content": {"Game":{"Subject":[{"Time":{"@value":"201511021057"}}]}} }]
Created ‎04-03-2024 12:46 PM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Can you send the JOLT just to understand what you are exactly extracting?
for example if you are extracting @value then something like "@value": "=toInteger" in JOLT can be written for typeconversion.
I belive type conversion can not be done in list.
Created ‎04-04-2024 04:31 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
"This is the output I obtained with executesqlrecord: [{ "Name": "Gta V" "Type": "Xyz" "content": "{"Game":{"Subject":[{"Time":{"@value":"201511021057" }}]}}" }] After executesqlrecord, I use ${'$1':unescapeJson()} in the replacetext processor to remove the backslash characters. Then my output looks like this: [{ "Name": "Gta V" "Type": "Xyz" "content": "{"Game":{"Subject":[{"Time":{"@value":"201511021057" }}]}}" }]
What I want is, after removing the backslashes, the nested JSON data in the content field should not come within double quotes. I want to learn the ways to remove it after ReplaceText. So, ultimately, the desired output is:
[{ "Name": "Gta V" "Type": "Xyz" "content": {"Game":{"Subject":[{"Time":{"@value":"201511021057"}}]}} }]
Created ‎06-05-2024 08:56 AM
- Mark as New
- Bookmark
- Subscribe
- Mute
- Subscribe to RSS Feed
- Permalink
- Report Inappropriate Content
Hi @mstfo ,
Firstly it seems that the output you have after the replacetext processor isn't really a json, because of the missing commas (' , ') between the fields.
If your output is exactly like this, it will be necessary to add those missing commas so then you can get a json like this:
[
{
"Name": "Gta V",
"Type": "Xyz",
"content": "{\"Game\":{\"Subject\":[{\"Time\":{\"@value\":\"201511021057\" }}]}}"
}
]
To obtain a json like this you can either:
- try to change the logic when you get the data from the database;
- try to add the commas manually.
The first option is the quicker solution, but there is the possibility that maybe you can't change the input you receive from the db, so perhaps you have to manually change it yourself.
Anyway, in case you manage to obtain a correct json from your input, then you can manipulate the string using the functions in the beta operations in jolt (like the =split()) and then reassemble the wanted json.
I wrote a guide and in the very last example i did, there is a similar case, so if you want go there and give it a look -> JOLT guide .
