Support Questions

Find answers, ask questions, and share your expertise

NiFi JoltTransformJSON to convert camelCase to snake_case

avatar
Contributor

Within my NiFi flow I have a number of datasets, with various schemas where the 'key' of key value pairs are on camelCase but i want any incoming keys to be outputted in snake case, does anyone know what jolt spec i can use to achieve this?

 

Example input:

{"aRandomFieldName":"Random Value"}

 

Required Output:

{"a_random_field_name":"Random Value"}

 

Been struggling to achieve this using Jolt or replaceText processors, any assistance would be great.

 

Thanks in advance

Andy.

1 ACCEPTED SOLUTION

avatar
Super Guru

@Griggsy ,

 

I don't know if there's a way to do exactly that with either JOLT or ReplaceText processors.

You can do it with ExecuteScript and the following Python script, though:

from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback
import json
import re

def to_snake_case(name):
    return re.sub(r'(?<!^)(?=[A-Z])', '_', name).lower()

def convert_keys_to_snake_case(obj):
    if isinstance(obj, list):
        return [convert_keys_to_snake_case(o) for o in obj]
    else:
        return {to_snake_case(k): v for k, v in obj.items()}

class PyStreamCallback(StreamCallback):
    def __init__(self):
        pass

    def process(self, inputStream, outputStream):
        text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        converted = convert_keys_to_snake_case(json.loads(text))
        outputStream.write(bytearray(json.dumps(converted).encode('utf-8','ignore')))

flow_file = session.get()
if flow_file != None:
    flow_file = session.write(flow_file, PyStreamCallback())
    session.transfer(flow_file, REL_SUCCESS)

 

Cheers,

André

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

View solution in original post

3 REPLIES 3

avatar
Contributor

I have stolen to spec below from here: https://stackoverflow.com/questions/54696540/jolt-transformation-lowercase-all-keys which works in lowercasing all keys, however I ideally need the hyphens between the words.

 

[

  {

    // unwrap the keys and values into literal

    // "key" : "A", "value" : "b"

    "operation": "shift",

    "spec": {

      "*": {

        "$": "&1.key",

        "@": "&1.value"

      }

    }

  },

  {

    "operation": "modify-overwrite-beta",

    "spec": {

      "*": {

        // Now that the origional key

        //  is on the "right hand side"

        //  lowercase it

        "key": "=toLower"

      }

    }

  },

  {

    // pivot back, the now lowercased keys

    "operation": "shift",

    "spec": {

      "*": {

        "value": "@(1,key)"

      }

    }

  }

]

avatar
Super Guru

@Griggsy ,

 

I don't know if there's a way to do exactly that with either JOLT or ReplaceText processors.

You can do it with ExecuteScript and the following Python script, though:

from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback
import json
import re

def to_snake_case(name):
    return re.sub(r'(?<!^)(?=[A-Z])', '_', name).lower()

def convert_keys_to_snake_case(obj):
    if isinstance(obj, list):
        return [convert_keys_to_snake_case(o) for o in obj]
    else:
        return {to_snake_case(k): v for k, v in obj.items()}

class PyStreamCallback(StreamCallback):
    def __init__(self):
        pass

    def process(self, inputStream, outputStream):
        text = IOUtils.toString(inputStream, StandardCharsets.UTF_8)
        converted = convert_keys_to_snake_case(json.loads(text))
        outputStream.write(bytearray(json.dumps(converted).encode('utf-8','ignore')))

flow_file = session.get()
if flow_file != None:
    flow_file = session.write(flow_file, PyStreamCallback())
    session.transfer(flow_file, REL_SUCCESS)

 

Cheers,

André

--
Was your question answered? Please take some time to click on "Accept as Solution" below this post.
If you find a reply useful, say thanks by clicking on the thumbs up button.

avatar
Contributor

@araujo thank you! I had gotten around the issue by using a replaceText loop to search for capitals within' the keys 1 at a time and prepend underscores, followed by a jolt to lowercase all keys.

Griggsy_0-1662112973053.png

You're script returns the same result, very impressed. Thanks again.