NiFi: Preparing JSON for use with json serde in Hive

In NiFi (1.5) I need the best way to prepare JSON so it can be parsed by the Hive JSON SerDe ('org.apache.hive.hcatalog.data.JsonSerDe').

I have an array of records (two in this example) in this format, taken from a REST API response:

[
    {
        "id": 1,
        "call_status": "OK",
        "result": 0.0239,
        "explanation": [
            "some_var",
            "another_var"
        ],
        "foo": "OK"
    },
    {
        "id": 2,
        "call_status": "OK",
        "result": 0.0239,
        "explanation": [
            "some_var",
            "another_var"
        ],
        "foo": "OK"
    }
]

It seems it must be transformed to the format below for the SerDe to work; I tried this manually with success. There are no array brackets, and each record sits on its own line.

{ "id": 1, "call_status": "OK", "result": 0.0239, "explanation": [ "some_var", "another_var" ], "foo": "OK" }
{ "id": 2, "call_status": "OK", "result": 0.0239, "explanation": [ "some_var", "another_var" ], "foo": "OK" }
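
For reference, the target layout is newline-delimited JSON. Outside NiFi, the same conversion can be sketched in a few lines of Python without any schema. This is only an illustration: it assumes the whole response fits in memory, and the function name `array_to_json_lines` is made up for the example.

```python
import json


def array_to_json_lines(text):
    """Convert a JSON array of objects into one compact JSON object per line."""
    records = json.loads(text)
    if not isinstance(records, list):
        raise ValueError("expected a top-level JSON array")
    # json.dumps without an indent argument keeps each record on a single line
    return "\n".join(json.dumps(record) for record in records)


sample = '[{"id": 1, "explanation": ["some_var", "another_var"]}, {"id": 2, "foo": "OK"}]'
print(array_to_json_lines(sample))
```

Because the input is parsed rather than pattern-matched, nested arrays and strings that happen to contain braces or commas are left intact.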

What is the recommended / most efficient way of doing this transformation? Preferably without having to supply the schema, since a generic solution would be nice. If a schema _is_ required, say for the Record processors, I'll live with that.


Re: NiFi: Preparing JSON for use with json serde in Hive

I've come up with an initial solution, but would like to hear about better alternatives in case this doesn't perform or scale well on larger inputs. Let me know how it can be expected to scale.

  1. FlattenJson
  2. Remove the outer array brackets with ReplaceText (Regex Replace), replacing ^\[(.*)]$ with $1 (a regex seems expensive for such a simple operation)
  3. Put each record object on its own line, with no comma separation, using a literal replace of },{ with }\n{
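
To show what steps 2 and 3 do to the content, here is a rough Python equivalent of the two replacements. It is a sketch, not the NiFi configuration itself: it assumes the input is already compact and on a single line (no whitespace between records), and the sample string is invented for the example.

```python
import json
import re

# Compact single-line input, as steps 2 and 3 assume
compact = '[{"id":1,"foo":"OK"},{"id":2,"foo":"OK"}]'

# Step 2: strip the outer array brackets, keeping only the captured group
inner = re.sub(r"^\[(.*)]$", r"\1", compact)

# Step 3: literal replacement of },{ so every record starts on a new line
json_lines = inner.replace("},{", "}\n{")

print(json_lines)

# Sanity check: each line should now parse as a standalone JSON object
parsed = [json.loads(line) for line in json_lines.split("\n")]
```

One caveat with the literal replace: it would also split on `},{` inside a record, for example in a nested array of objects or in a string value containing that sequence, so it is only safe when records have a shape like the sample above.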

Remember, I'm on NiFi 1.5. Thanks.
