Created on 06-10-2019 10:38 PM - edited 08-17-2019 02:15 PM
OBJECTIVE:
Provide a quick-start guide for using the Jolt language within a NiFi JoltTransform (JoltTransformJSON or JoltTransformRecord).
OVERVIEW:
The NiFi JoltTransform uses the powerful Jolt language to transform JSON. Combined with the NiFi Schema Registry, this gives NiFi the ability to traverse, recurse, transform, and modify nearly any data format that can be described in Avro, using JSON as an intermediary step.
The language itself is open source and some documentation is available in the JavaDoc, but this article provides a starting point for understanding basic Jolt operations.
PREREQUISITES:
HDF 3.0 or later (NiFi 1.2.0.3 or later)
BASICS OF JOLT:
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly" } } }
Spec:
[ { "operation": "shift", "spec": { "breadbox": "counterTop" } } ]
Output:
{ "counterTop": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } } }
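The example above shows the key property of shift: only what the spec matches is copied to the output, so "fridge" disappears. As a rough plain-Python analogue (not the Jolt engine; `shift_rename` is a hypothetical helper named for illustration only):

```python
def shift_rename(doc, old_key, new_key):
    """Copy doc[old_key] to the output under new_key; drop everything else."""
    output = {}
    if old_key in doc:
        output[new_key] = doc[old_key]
    return output


kitchen = {
    "breadbox": {"loaf1": {"type": "white"}, "loaf2": {"type": "wheat"}},
    "fridge": {"jar1": {"contents": "peanut butter"}, "jar2": {"contents": "jelly"}},
}

# "breadbox" is moved to "counterTop"; "fridge" is not matched, so it is not copied.
moved = shift_rename(kitchen, "breadbox", "counterTop")
print(moved)
```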
Input:
{ "counterTop": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" }, "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly" } } }
Spec:
[ { "operation": "default", "spec": { "counterTop": { "loaf1": { "slices": [ "slice1", "slice2", "slice3", "slice4" ] } } } } ]
Output:
{ "counterTop" : { "loaf1" : { "type" : "white", "slices" : [ "slice1", "slice2", "slice3", "slice4" ] }, "loaf2" : { "type" : "wheat" }, "jar1" : { "contents" : "peanut butter" }, "jar2" : { "contents" : "jelly" } } }
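In this example, default adds "slices" because it is missing and leaves the existing "type" untouched. A minimal Python sketch of that fill-in-only-what-is-absent behavior, assuming missing keys are the only trigger (`apply_defaults` is a hypothetical helper, not part of Jolt or NiFi):

```python
import copy


def apply_defaults(doc, defaults):
    """Recursively add keys from `defaults` that are absent in `doc`."""
    for key, default_value in defaults.items():
        if key not in doc:
            doc[key] = copy.deepcopy(default_value)
        elif isinstance(doc[key], dict) and isinstance(default_value, dict):
            apply_defaults(doc[key], default_value)
        # Otherwise the key already holds a value, so it is left alone.
    return doc


counter = {"counterTop": {"loaf1": {"type": "white"}, "loaf2": {"type": "wheat"}}}
defaults = {
    "counterTop": {
        "loaf1": {
            "type": "rye",  # ignored: loaf1 already has a type
            "slices": ["slice1", "slice2", "slice3", "slice4"],
        }
    }
}

result = apply_defaults(counter, defaults)
print(result["counterTop"]["loaf1"])
```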
Input:
{ "counterTop": { "loaf1": { "type": "white", "slices": [ "slice1", "slice2", "slice3", "slice4" ] }, "loaf2": { "type": "wheat" }, "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly" } } }
Spec:
[ { "operation": "cardinality", "spec": { "counterTop": { "loaf1": { "slices": "ONE" } } } } ]
Output:
{ "counterTop" : { "loaf1" : { "type" : "white", "slices" : "slice1" }, "loaf2" : { "type" : "wheat" }, "jar1" : { "contents" : "peanut butter" }, "jar2" : { "contents" : "jelly" } } }
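The cardinality example reduces the "slices" list to its first element. Sketched in plain Python (a hypothetical `cardinality_one` helper; returning None for an empty list is an assumption of this sketch, not something the example demonstrates):

```python
def cardinality_one(value):
    """Return the first element of a list; pass non-list values through."""
    if isinstance(value, list):
        # Assumption: an empty list collapses to None.
        return value[0] if value else None
    return value


loaf1 = {"type": "white", "slices": ["slice1", "slice2", "slice3", "slice4"]}
loaf1["slices"] = cardinality_one(loaf1["slices"])
print(loaf1)
```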
Input:
{ "counterTop": { "loaf1": { "type": "white", "slices": "slice1" }, "loaf2": { "type": "wheat" }, "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly" } } }
Spec:
[ { "operation": "remove", "spec": { "counterTop": { "loaf2": "", "jar1": "" } } } ]
Output:
{ "counterTop" : { "loaf1" : { "type" : "white", "slices" : "slice1" }, "jar2" : { "contents" : "jelly" } } }
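The remove spec mirrors the shape of the input and marks the keys to delete with an empty string. A rough Python equivalent of that walk (`remove_keys` is a hypothetical name used only for this sketch):

```python
def remove_keys(doc, spec):
    """Delete every key that the spec marks with a non-dict leaf (e.g. "")."""
    for key, sub_spec in spec.items():
        if key not in doc:
            continue
        if isinstance(sub_spec, dict) and isinstance(doc[key], dict):
            remove_keys(doc[key], sub_spec)  # descend, keep the container
        else:
            del doc[key]
    return doc


counter = {
    "counterTop": {
        "loaf1": {"type": "white", "slices": "slice1"},
        "loaf2": {"type": "wheat"},
        "jar1": {"contents": "peanut butter"},
        "jar2": {"contents": "jelly"},
    }
}
spec = {"counterTop": {"loaf2": "", "jar1": ""}}

cleaned = remove_keys(counter, spec)
print(cleaned)
```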
Input:
{ "counterTop": { "loaf1": { "type": "white", "slices": "slice1" }, "jar2": { "contents": "jelly" } } }
Spec:
[ { "operation": "modify-overwrite-beta", "spec": { "counterTop": { "jar2": { "contents": "=toUpper" } } } } ]
Output:
{ "counterTop" : { "loaf1" : { "type" : "white", "slices" : "slice1" }, "jar2" : { "contents" : "JELLY" } } }
Input:
{ "counterTop": { "loaf1": { "type": "white", "slices": "slice1" }, "jar2": { "contents": "JELLY" } } }
Spec:
[ { "operation": "sort" } ]
Output:
{ "counterTop" : { "jar2" : { "contents" : "JELLY" }, "loaf1" : { "slices" : "slice1", "type" : "white" } } }
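Sort takes no spec and reorders keys alphabetically at every level, which is why "jar2" now precedes "loaf1" and "slices" precedes "type". The same effect in plain Python, relying on dicts preserving insertion order (Python 3.7+); `sort_keys` is a hypothetical helper:

```python
def sort_keys(value):
    """Return a copy of `value` with dict keys sorted at every nesting level."""
    if isinstance(value, dict):
        return {key: sort_keys(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        return [sort_keys(item) for item in value]
    return value


counter = {
    "counterTop": {
        "loaf1": {"type": "white", "slices": "slice1"},
        "jar2": {"contents": "JELLY"},
    }
}

ordered = sort_keys(counter)
print(ordered)
```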
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly" } } }
Spec:
[ { "operation": "shift", "spec": { "*": "counterTop" } } ]
Output:
{ "counterTop" : [ { "loaf1" : { "type" : "white" }, "loaf2" : { "type" : "wheat" } }, { "jar1" : { "contents" : "peanut butter" }, "jar2" : { "contents" : "jelly" } } ] }
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly", "expiration": "25-APR-2019" } } }
Spec:
[ { "operation": "shift", "spec": { "fridge": { "jar2": { "expiration": { "*-*-*": { "$(0,3)": "expiry.year" } } } } } } ]
Output:
{ "expiry" : { "year" : "2019" } }
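Here the value "25-APR-2019" is matched against the pattern "*-*-*", and "$(0,3)" picks the text captured by the third wildcard. A small regex-based sketch of that capture, where index 0 is the whole match and indexes 1..n are the wildcards (`wildcard_groups` is a hypothetical helper, and non-greedy matching is an assumption of this sketch):

```python
import re


def wildcard_groups(pattern, text):
    """Match `text` against a '*' pattern; return [whole, star1, star2, ...]."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    regex = "(.*?)".join(literal_parts)
    match = re.fullmatch(regex, text)
    if match is None:
        return None
    return [text] + list(match.groups())


groups = wildcard_groups("*-*-*", "25-APR-2019")
print(groups)

# Index 3 is the third wildcard, analogous to "$(0,3)" in the spec above.
expiry = {"expiry": {"year": groups[3]}}
print(expiry)
```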
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly", "expiration": "25-APR-2019" } } }
Spec:
[ { "operation": "shift", "spec": { "fridge": { "jar2": { "@": "garbage0", "contents": "garbage1", "@0": "garbage2", "@1": "garbage3", "@2": "garbage4" } } } } ]
Output:
{ "garbage0" : { "contents" : "jelly", "expiration" : "25-APR-2019" }, "garbage1" : "jelly", "garbage2" : { "contents" : "jelly", "expiration" : "25-APR-2019" }, "garbage3" : { "jar1" : { "contents" : "peanut butter" }, "jar2" : { "contents" : "jelly", "expiration" : "25-APR-2019" } }, "garbage4" : { "breadbox" : { "loaf1" : { "type" : "white" }, "loaf2" : { "type" : "wheat" } }, "fridge" : { "jar1" : { "contents" : "peanut butter" }, "jar2" : { "contents" : "jelly", "expiration" : "25-APR-2019" } } } }
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly", "expiration": "25-APR-2019" } } }
Spec:
[ { "operation": "shift", "spec": { "fridge": { "jar2": { "contents": "&0" } } } } ]
Output:
{ "contents" : "jelly" }
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly", "expiration": "25-APR-2019" } } }
Spec:
[ { "operation": "shift", "spec": { "fridge": { "jar2": { "contents": "&1" } } } } ]
Output:
{ "jar2" : "jelly" }
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly", "expiration": "25-APR-2019" } } }
Spec:
[ { "operation": "shift", "spec": { "fridge": { "jar2": { "contents": "&(2,0)" } } } } ]
Output:
{ "fridge" : "jelly" }
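The three "&" examples above differ only in how far up the matched path they look for the output key name: "&0" is the matched key itself, "&1" its parent, and "&(2,0)" its grandparent. A tiny Python sketch of that lookup (`ancestor_key` is a hypothetical helper for illustration):

```python
def ancestor_key(path, levels_up):
    """Return the key `levels_up` steps above the end of the matched path."""
    return path[-1 - levels_up]


# Path walked by the spec: fridge -> jar2 -> contents
matched_path = ["fridge", "jar2", "contents"]
value = "jelly"

for levels_up in (0, 1, 2):
    print({ancestor_key(matched_path, levels_up): value})
```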
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly", "expiration": "25-APR-2019" } } }
Spec:
[ { "operation": "shift", "spec": { "*": { "jar*": { "@(0,contents)": "Things in jars" } } } } ]
Output:
{ "Things in jars" : [ "peanut butter", "jelly" ] }
Input:
{ "breadbox": { "loaf1": { "type": "white" }, "loaf2": { "type": "wheat" } }, "fridge": { "jar1": { "contents": "peanut butter" }, "jar2": { "contents": "jelly", "expiration": "25-APR-2019" } } }
Spec:
[ { "operation": "shift", "spec": { "*": { "jar*": { "$0": "List of jars" } } } } ]
Output:
{ "List of jars" : [ "jar1", "jar2" ] }
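In the wildcard examples above, several matches write to the same output key, and the result becomes an array. A sketch of that accumulate-into-a-list behavior in plain Python (`write` is a hypothetical helper; `fnmatchcase` stands in for the "jar*" wildcard):

```python
from fnmatch import fnmatchcase


def write(output, key, value):
    """First write stores the value; later writes to the same key build a list."""
    if key not in output:
        output[key] = value
    elif isinstance(output[key], list):
        output[key].append(value)
    else:
        output[key] = [output[key], value]


kitchen = {
    "breadbox": {"loaf1": {"type": "white"}, "loaf2": {"type": "wheat"}},
    "fridge": {
        "jar1": {"contents": "peanut butter"},
        "jar2": {"contents": "jelly", "expiration": "25-APR-2019"},
    },
}

contents = {}
names = {}
for container in kitchen.values():          # plays the role of the outer "*"
    for name, item in container.items():
        if fnmatchcase(name, "jar*"):       # plays the role of "jar*"
            write(contents, "Things in jars", item["contents"])
            write(names, "List of jars", name)

print(contents)
print(names)
```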
Chained Spec:
[ { "operation": "shift", "spec": { "particles": ["particles-orig", "particles-0", "particles-1", "particles-2", "particles-3", "particles-4"], "timestamp": "ts", "*": "&" } }, { "operation": "shift", "spec": { "particles-orig": "particles-orig", "particles-0": { "*;*;*;*;*": { "$(0,1)": "tmp.particle1[]", "$(0,2)": "tmp.particle2[]", "$(0,3)": "tmp.particle3[]", "$(0,4)": "tmp.particle4[]", "$(0,5)": "tmp.particle5[]" } }, "particles-1": { "*;*;*;*": { "$(0,1)": "tmp.particle1[]", "$(0,2)": "tmp.particle2[]", "$(0,3)": "tmp.particle3[]", "$(0,4)": "tmp.particle4[]" } }, "particles-2": { "*;*;*": { "$(0,1)": "tmp.particle1[]", "$(0,2)": "tmp.particle2[]", "$(0,3)": "tmp.particle3[]" } }, "particles-3": { "*;*": { "$(0,1)": "tmp.particle1[]", "$(0,2)": "tmp.particle2[]" } }, "particles-4": "tmp.particle1[]", "*": "&" } }, { "operation": "shift", "spec": { "tmp": { "*": { "0": { "*,*,*,*": { "@(4,runid)": "particles.[#4].runid", "@(4,ts)": "particles.[#4].ts", "$(0,1)": "particles.[#4].Xloc", "$(0,2)": "particles.[#4].Yloc", "$(0,3)": "particles.[#4].Xdim", "$(0,4)": "particles.[#4].Ydim" } } } }, "*": "&" } }, { "operation": "remove", "spec": { "particles-orig": "" } } ]
Created on 07-01-2019 01:43 PM
Hello,
thanks for perfect examples!
What would the Jolt specification for this input/output look like, please? The number of tags can be dynamic, and the delimiter is always a colon.
Input:
{ "log": { "vector": "tag1:tag2:tag3" } }
Spec:
???
Output:
{ "tags" : [ "tag1","tag2","tag3" ] }
Thanks
Created on 07-01-2019 05:20 PM
I really appreciated your work. I bookmarked this page.
Created on 06-02-2020 03:22 AM
[
  {
    "operation": "shift",
    "spec": {
      "log": {
        "vector": "tags"
      }
    }
  },
  {
    "operation": "modify-overwrite-beta",
    "spec": {
      "tags": "=split(':',@0)"
    }
  }
]
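For comparison, the transformation requested earlier in the thread (move log.vector to "tags", then split on the colon) looks like this in plain Python. This is only an analogue of the intended result, not a check of the spec against a Jolt engine; `tags_from_log` is a hypothetical helper:

```python
def tags_from_log(record, delimiter=":"):
    """Move log.vector to 'tags' and split it on the delimiter."""
    vector = record["log"]["vector"]
    return {"tags": vector.split(delimiter)}


record = {"log": {"vector": "tag1:tag2:tag3"}}
tagged = tags_from_log(record)
print(tagged)
```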
Created on 09-27-2020 08:08 AM
Hi @wcbdata
Can you explain the usage of '#' in the spec you used above:
{ "operation": "shift", "spec": { "tmp": { "*": { "0": { "*,*,*,*": { "@(4,runid)": "particles.[#4].runid", "@(4,ts)": "particles.[#4].ts", "$(0,1)": "particles.[#4].Xloc", "$(0,2)": "particles.[#4].Yloc", "$(0,3)": "particles.[#4].Xdim", "$(0,4)": "particles.[#4].Ydim" } } } }, "*": "&" } }