Created 08-22-2022 07:36 AM
Within my NiFi flow I have a number of datasets, with various schemas where the 'key' of key value pairs are on camelCase but i want any incoming keys to be outputted in snake case, does anyone know what jolt spec i can use to achieve this?
Example input:
{"aRandomFieldName":"Random Value"}
Required Output:
{"a_random_field_name":"Random Value"}
Been struggling to achieve this using Jolt or replaceText processors, any assistance would be great.
Thanks in advance
Andy.
Created 08-26-2022 09:07 PM
@Griggsy ,
I don't know if there's a way to do exactly that with either JOLT or ReplaceText processors.
You can do it with ExecuteScript and the following Python script, though:
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback
import json
import re
def to_snake_case(name):
return re.sub(r'(?<!^)(?=[A-Z])', '_', name).lower()
def convert_keys_to_snake_case(obj):
if isinstance(obj, list):
return [convert_keys_to_snake_case(o) for o in obj]
else:
return {to_snake_case(k): v for k, v in obj.items()}
class PyStreamCallback(StreamCallback):
def __init__(self):
pass
def process(self, inputStream, outputStream):
text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
converted = convert_keys_to_snake_case(json.loads(text))
outputStream.write(bytearray(json.dumps(converted).encode('utf-8','ignore')))
flow_file = session.get()
if flow_file != None:
flow_file = session.write(flow_file, PyStreamCallback())
session.transfer(flow_file, REL_SUCCESS)
Cheers,
André
Created 08-22-2022 08:16 AM
I have stolen to spec below from here: https://stackoverflow.com/questions/54696540/jolt-transformation-lowercase-all-keys which works in lowercasing all keys, however I ideally need the hyphens between the words.
[
{
// unwrap the keys and values into literal
// "key" : "A", "value" : "b"
"operation": "shift",
"spec": {
"*": {
"$": "&1.key",
"@": "&1.value"
}
}
},
{
"operation": "modify-overwrite-beta",
"spec": {
"*": {
// Now that the origional key
// is on the "right hand side"
// lowercase it
"key": "=toLower"
}
}
},
{
// pivot back, the now lowercased keys
"operation": "shift",
"spec": {
"*": {
"value": "@(1,key)"
}
}
}
]
Created 08-26-2022 09:07 PM
@Griggsy ,
I don't know if there's a way to do exactly that with either JOLT or ReplaceText processors.
You can do it with ExecuteScript and the following Python script, though:
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback
import json
import re
def to_snake_case(name):
return re.sub(r'(?<!^)(?=[A-Z])', '_', name).lower()
def convert_keys_to_snake_case(obj):
if isinstance(obj, list):
return [convert_keys_to_snake_case(o) for o in obj]
else:
return {to_snake_case(k): v for k, v in obj.items()}
class PyStreamCallback(StreamCallback):
def __init__(self):
pass
def process(self, inputStream, outputStream):
text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
converted = convert_keys_to_snake_case(json.loads(text))
outputStream.write(bytearray(json.dumps(converted).encode('utf-8','ignore')))
flow_file = session.get()
if flow_file != None:
flow_file = session.write(flow_file, PyStreamCallback())
session.transfer(flow_file, REL_SUCCESS)
Cheers,
André
Created 09-02-2022 03:04 AM
@araujo thank you! I had gotten around the issue by using a replaceText loop to search for capitals within' the keys 1 at a time and prepend underscores, followed by a jolt to lowercase all keys.
You're script returns the same result, very impressed. Thanks again.