Created on 06-10-2019 10:38 PM - edited 08-17-2019 02:15 PM
OBJECTIVE:
Provide a quick-start guide for using the Jolt language within a NiFi JoltTransform (JoltTransformJSON or JoltTransformRecord).
OVERVIEW:
The NiFi JoltTransform processors use the powerful Jolt language to parse and transform JSON. Combined with the NiFi Schema Registry, this gives NiFi the ability to traverse, recurse, transform, and modify nearly any data format that can be described in Avro or handled using JSON as an intermediary step.
The language itself is open source and some documentation is available in its JavaDoc; this article provides a starting point for understanding basic Jolt operations.
PREREQUISITES:
HDF 3.0 or later (NiFi 1.2.0.3 or later)
BASICS OF JOLT:
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly" } } }
Spec:
[ { "operation": "shift", "spec": { "breadbox": "counterTop" } } ]
Output:
{ "counterTop": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } } }
Input:
{ "counterTop": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" }, "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly" } } }
Spec:
[ { "operation": "default", "spec": { "counterTop": { "loaf1": { "slices": [ "slice1", "slice2", "slice3", "slice4" ] } } } } ]
Output:
{ "counterTop" : { "loaf1" : { "type" : "white", "slices" : [ "slice1", "slice2", "slice3", "slice4" ] }, "loaf2" : { "type" : "wheat" }, "jar1" : { "contents" : "peanut butter" }, "jar2" : { "contents" : "jelly" } } }
Input:
{ "counterTop": { "loaf1": { "type": "white", "slices": [ "slice1", "slice2", "slice3", "slice4" ] }, "loaf2": { "type": "wheat" }, "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly" } } }
Spec:
[ { "operation": "cardinality", "spec": { "counterTop": { "loaf1": { "slices": "ONE" } } } } ]
Output:
{ "counterTop" : { "loaf1" : { "type" : "white", "slices" : "slice1" }, "loaf2" : { "type" : "wheat" }, "jar1" : { "contents" : "peanut butter" }, "jar2" : { "contents" : "jelly" } } }
Input:
{ "counterTop": { "loaf1": { "type": "white", "slices": "slice1" }, "loaf2": { "type": "wheat" }, "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly" } } }
Spec:
[ { "operation": "remove", "spec": { "counterTop": { "loaf2": "", "jar1": "" } } } ]
Output:
{ "counterTop" : { "loaf1" : { "type" : "white", "slices" : "slice1" }, "jar2" : { "contents" : "jelly" } } }
Input:
{ "counterTop": { "loaf1": { "type": "white", "slices": "slice1" }, "jar2": { "contents": "jelly" } } }
Spec:
[ { "operation": "modify-overwrite-beta", "spec": { "counterTop": { "jar2": { "contents": "=toUpper" } } } } ]
Output:
{ "counterTop" : { "loaf1" : { "type" : "white", "slices" : "slice1" }, "jar2" : { "contents" : "JELLY" } } }
Input:
{ "counterTop": { "loaf1": { "type": "white", "slices": "slice1" }, "jar2": { "contents": "JELLY" } } }
Spec:
[ { "operation": "sort" } ]
Output:
{ "counterTop" : { "jar2" : { "contents" : "JELLY" }, "loaf1" : { "slices" : "slice1", "type" : "white" } } }
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly" } } }
Spec:
[ { "operation": "shift", "spec": { "*": "counterTop" } } ]
Output:
{ "counterTop" : [ { "loaf1" : { "type" : "white" }, "loaf2" : { "type" : "wheat" } }, { "jar1" : { "contents" : "peanut butter" }, "jar2" : { "contents" : "jelly" } } ] }
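The array in the output above appears because the "*" wildcard matches both "breadbox" and "fridge", and both values are shifted to the same output key. As a rough illustration of that collecting behavior, here is a minimal plain-Python sketch. It is written for this note and is not Jolt's implementation; the helper name shift_all_to is hypothetical.

```python
def shift_all_to(doc, target):
    """Imitate the shift spec {"*": target} for the top-level keys of doc."""
    matches = list(doc.values())      # "*" matches every top-level key
    if not matches:
        return {}                     # nothing matched, nothing is written
    if len(matches) == 1:
        return {target: matches[0]}   # one match: the value is moved as-is
    return {target: matches}          # several matches: values are collected in a list


pantry = {
    "breadbox": {"loaf1": {"type": "white"}, "loaf2": {"type": "wheat"}},
    "fridge": {"jar1": {"contents": "peanut butter"}, "jar2": {"contents": "jelly"}},
}
result = shift_all_to(pantry, "counterTop")
```

With a single matching key the value is moved unchanged, as in the first "breadbox" example; with two matches the values end up side by side in a list.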
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly", "expiration": "25-APR-2019" } } }
Spec:
[ { "operation": "shift", "spec": { "fridge": { "jar2": { "expiration": { "*-*-*": { "$(0,3)": "expiry.year" } } } } } } ]
Output:
{ "expiry" : { "year" : "2019" } }
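In the spec above, "*-*-*" matches the string "25-APR-2019" and "$(0,3)" selects what the third "*" captured. The following plain-Python sketch mimics that matching with a regular expression. It is only an illustration under that assumption, not how Jolt is implemented, and the helper name capture_groups is hypothetical.

```python
import re


def capture_groups(pattern, key):
    """Turn a Jolt-style '*' pattern into a regex and return what each '*' matched."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    regex = "^" + "(.+?)".join(literal_parts) + "$"
    match = re.match(regex, key)
    return list(match.groups()) if match else None


parts = capture_groups("*-*-*", "25-APR-2019")
# Jolt numbers the captures from 1, so $(0,3) corresponds to parts[2] here.
result = {"expiry": {"year": parts[2]}}
```

By the same numbering, "$(0,1)" would correspond to the day and "$(0,2)" to the month.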
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly", "expiration": "25-APR-2019" } } }
Spec:
[ { "operation": "shift", "spec": { "fridge": { "jar2": { "@": "garbage0", "contents": "garbage1", "@0": "garbage2", "@1": "garbage3", "@2": "garbage4" } } } } ]
Output:
{ "garbage0" : { "contents" : "jelly", "expiration" : "25-APR-2019" }, "garbage1" : "jelly", "garbage2" : { "contents" : "jelly", "expiration" : "25-APR-2019" }, "garbage3" : { "jar1" : { "contents" : "peanut butter" }, "jar2" : { "contents" : "jelly", "expiration" : "25-APR-2019" } }, "garbage4" : { "breadbox" : { "loaf1" : { "type" : "white" }, "loaf2" : { "type" : "wheat" } }, "fridge" : { "jar1" : { "contents" : "peanut butter" }, "jar2" : { "contents" : "jelly", "expiration" : "25-APR-2019" } } } }
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly", "expiration": "25-APR-2019" } } }
Spec:
[ { "operation": "shift", "spec": { "fridge": { "jar2": { "contents": "&0" } } } } ]
Output:
{ "contents" : "jelly" }
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly", "expiration": "25-APR-2019" } } }
Spec:
[ { "operation": "shift", "spec": { "fridge": { "jar2": { "contents": "&1" } } } } ]
Output:
{ "jar2" : "jelly" }
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly", "expiration": "25-APR-2019" } } }
Spec:
[ { "operation": "shift", "spec": { "fridge": { "jar2": { "contents": "&(2,0)" } } } } ]
Output:
{ "fridge" : "jelly" }
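The three specs above differ only in how far up the input path the "&" reference looks: "&0" is the matched key itself, "&1" is its parent, and "&(2,0)" is the key two levels up. A minimal plain-Python sketch of that lookup follows. The helper name amp is hypothetical and the path list is an assumption standing in for Jolt's internal bookkeeping.

```python
def amp(path, levels_up):
    """Return the input key found levels_up levels above the matched leaf key."""
    return path[-1 - levels_up]


path = ["fridge", "jar2", "contents"]   # keys walked to reach the value "jelly"
value = "jelly"

# Each "&" reference becomes the output key under which the value is written.
outputs = {
    ref: {amp(path, levels_up): value}
    for ref, levels_up in [("&0", 0), ("&1", 1), ("&(2,0)", 2)]
}
```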
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly", "expiration": "25-APR-2019" } } }
Spec:
[ { "operation": "shift", "spec": { "*": { "jar*": { "@(0,contents)": "Things in jars" } } } } ]
Output:
{ "Things in jars" : [ "peanut butter", "jelly" ] }
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly", "expiration": "25-APR-2019" } } }
Spec:
[ { "operation": "shift", "spec": { "*": { "jar*": { "$0": "List of jars" } } } } ]
Output:
{ "List of jars" : [ "jar1", "jar2" ] }
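The last two specs walk the same keys but write different things: "@(0,contents)" writes the value found under "contents", while "$0" writes the matched key name itself. The plain-Python sketch below produces both results in one pass for comparison. It is an illustration only, with the hypothetical helper name collect_from_jars, and it combines two outputs that the Jolt examples above produce separately.

```python
def collect_from_jars(doc):
    """Collect jar key names and jar contents, imitating the two specs above."""
    names, contents = [], []
    for container in doc.values():                   # outer "*"
        for key, value in container.items():
            if key.startswith("jar"):                # "jar*"
                names.append(key)                    # "$0": the matched key name
                contents.append(value["contents"])   # "@(0,contents)": the value
    return {"List of jars": names, "Things in jars": contents}


kitchen = {
    "breadbox": {"loaf1": {"type": "white"}, "loaf2": {"type": "wheat"}},
    "fridge": {
        "jar1": {"contents": "peanut butter"},
        "jar2": {"contents": "jelly", "expiration": "25-APR-2019"},
    },
}
result = collect_from_jars(kitchen)
```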
Chained Spec:
[ { "operation": "shift", "spec": { "particles": ["particles-orig", "particles-0", "particles-1", "particles-2", "particles-3", "particles-4"], "timestamp": "ts", "*": "&" } }, { "operation": "shift", "spec": { "particles-orig": "particles-orig", "particles-0": { "*;*;*;*;*": { "$(0,1)": "tmp.particle1[]", "$(0,2)": "tmp.particle2[]", "$(0,3)": "tmp.particle3[]", "$(0,4)": "tmp.particle4[]", "$(0,5)": "tmp.particle5[]" } }, "particles-1": { "*;*;*;*": { "$(0,1)": "tmp.particle1[]", "$(0,2)": "tmp.particle2[]", "$(0,3)": "tmp.particle3[]", "$(0,4)": "tmp.particle4[]" } }, "particles-2": { "*;*;*": { "$(0,1)": "tmp.particle1[]", "$(0,2)": "tmp.particle2[]", "$(0,3)": "tmp.particle3[]" } }, "particles-3": { "*;*": { "$(0,1)": "tmp.particle1[]", "$(0,2)": "tmp.particle2[]" } }, "particles-4": "tmp.particle1[]", "*": "&" } }, { "operation": "shift", "spec": { "tmp": { "*": { "0": { "*,*,*,*": { "@(4,runid)": "particles.[#4].runid", "@(4,ts)": "particles.[#4].ts", "$(0,1)": "particles.[#4].Xloc", "$(0,2)": "particles.[#4].Yloc", "$(0,3)": "particles.[#4].Xdim", "$(0,4)": "particles.[#4].Ydim" } } } }, "*": "&" } }, { "operation": "remove", "spec": { "particles-orig": "" } } ]
Created on 07-01-2019 01:43 PM
Hello,
thanks for perfect examples!
What would the Jolt specification for this input/output look like, please? The number of tags can be dynamic, and the delimiter is always a colon.
Input:
{ "log": { "vector": "tag1:tag2:tag3" } }
Spec:
???
Output:
{ "tags" : [ "tag1","tag2","tag3" ] }
Thanks
Created on 07-01-2019 05:20 PM
I really appreciated your work. I bookmarked this page.
Created on 06-02-2020 03:22 AM
[
  {
    "operation": "shift",
    "spec": {
      "log": {
        "vector": "tags"
      }
    }
  },
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "tags": "=split(':',@0)"
    }
  }
]
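For readers tracing the spec above: it chains two operations, first moving "log.vector" to "tags" and then splitting that string on the colon. The same two steps in plain Python look roughly like this. This is a sketch for illustration, not part of the original reply, and the helper name move_and_split is hypothetical.

```python
def move_and_split(doc):
    """Imitate the two chained operations from the spec above."""
    tags = doc["log"]["vector"]        # shift: log.vector -> tags
    return {"tags": tags.split(":")}   # modify-overwrite-beta: split on ':'


result = move_and_split({"log": {"vector": "tag1:tag2:tag3"}})
```

Because the split happens on every colon, the number of tags in the input does not need to be known in advance.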
Created on 09-27-2020 08:08 AM
Hi @wcbdata
Can you explain the usage of '#' in the spec you used above:
{ "operation": "shift", "spec": { "tmp": { "*": { "0": { "*,*,*,*": { "@(4,runid)": "particles.[#4].runid", "@(4,ts)": "particles.[#4].ts", "$(0,1)": "particles.[#4].Xloc", "$(0,2)": "particles.[#4].Yloc", "$(0,3)": "particles.[#4].Xdim", "$(0,4)": "particles.[#4].Ydim" } } } }, "*": "&" } }
Created on 02-07-2021 06:45 PM
Hi,
Thanks for the good explanation.
What would the Jolt specification be for the following input/output?
There are two JSON inputs:
1st JSON input:
{ "Name" : "Alvin", "Status" : "Single", "Life" : [ { "Sport" : "Swimming", "Singing" : "K-box", "Food" : "Burger", "Alcohol" : "Rum" }, { "Sport" : "Boxing", "Singing" : "party world", "Food" : "Chicken Wing", "Alcohol" : "Whisky" }, { "Sport" : "Running", "Singing" : "KTV", "Food" : "Muffin", "Alcohol" : "Martel" } ] }
2nd JSON input:
{ "Name" : "Alvin" }
Both JSON messages should go through the same JoltTransformJSON processor and come out with the following output:
1st JSON output:
{ "Name" : "Alvin", "Status" : "Single", "Sport" : [ "Swimming", "Boxing", "Running" ], "Singing" : [ "K-box", "party world", "KTV" ], "Food" : [ "Burger", "Chicken Wing", "Muffin" ], "Alcohol" : [ "Rum", "Whisky", "Martel" ] }
2nd JSON output:
{ "Name" : "Alvin", "Status" : "Single", "Sport" : [ "Swimming" ], "Singing" : [ "K-box" ], "Food" : [ "Burger" ], "Alcohol" : [ "Rum" ] }
How can I configure the JoltTransformJSON processor to get the above output? Or is there another way to do it? Please advise with steps and an example. I would appreciate it.
Created on 07-20-2023 11:53 PM
Hi,
I am a newbie to NiFi and its processors.
I want to understand what the Jolt specification would be for the following input/output. Or can anyone suggest another processor?
Input JSON:
{"cells": {"deviceindicators:1234_32456_789023":"0", "deviceindicators:5678_89213_875943":"110"}}
Output JSON:
{"cells": {"1234_32456_789023":"0", "5678_89213_875943":"110"}}
I want to remove "deviceindicators:" from the keys using JoltTransformJSON. Please advise.
Created on 07-21-2023 02:51 AM
@VaibhavK, Welcome to the Cloudera Community. As this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.
Created on 07-22-2023 01:40 AM
Thanks for the Awesome information!