
NIFI: Transformation into new XML structure.


Folks,


I am struggling to get to grips with JOLT/AVRO to convert files into XML.


My flow is:


  1. ConsumeKafka (The Kafka topic receives JSON-based files from Filebeat; the flowfile created is a text/plain file.)

  2. UpdateAttribute (Creates an attribute mime.type = application/json.)

  3. AttributesToJSON

  4. EvaluateJsonPath (I want to route later on some NiFi attributes.)

  5. ReplaceText (Removes @ from any field names.)

  6. JoltTransformJSON (Renames a number of fields, e.g. timestamp -> TimeCreated.)

  7. InferAvroSchema (To create an Avro schema based on the input.)


  8. ConvertRecord (to XML)

    Record Reader = JsonTreeReader (Uses the inferred Avro schema from step 7 above.)

    Record Writer = XMLRecordSetWriter (Uses an AvroSchemaRegistry and a defined schema.)


The problem I have is that the JSON input will look very different to the XML output.


I have figured out how to rename fields, but how do I create new nested XML elements?


E.g. input (text/plain -> application/json):


{
  "field1": "xxxxxxxxxxxxxxxx",
  "field2": "xxxxxxxxx",
  "field3": {
    "field4": "xxxxxxxxxxxxxxxx",
    "field5": "xxxxxxxxxxxxxxxx"
  },
  "field6": "xxxxxxxxxx"
}


E.g. the output needs to be (XML):


<fielda>
  <fieldb>
    <field1>xxxxxxxxxxxxxx</field1>
  </fieldb>
</fielda>
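
To make the target mapping concrete, here is a minimal Python sketch of the restructuring, run outside NiFi. It is an illustration under assumptions, not a NiFi configuration: `SHIFT_SPEC`, `shift` and `to_xml` are hypothetical names, and the mapping only mimics the simplest case of a Jolt shift operation, where a top-level key is moved to a dotted output path.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical mapping: input key -> dotted output path.
# It has the same shape as the "spec" of a simple Jolt shift operation.
SHIFT_SPEC = {"field1": "fielda.fieldb.field1"}


def shift(record, spec):
    """Copy each mapped top-level key to its dotted path in a new nested dict."""
    out = {}
    for src, dest in spec.items():
        if src not in record:
            continue  # keys missing from the input are skipped
        *parents, leaf = dest.split(".")
        node = out
        for name in parents:
            node = node.setdefault(name, {})
        node[leaf] = record[src]
    return out


def to_xml(name, value):
    """Turn a nested dict into an element tree; leaves become text nodes."""
    elem = ET.Element(name)
    if isinstance(value, dict):
        for child_name, child_value in value.items():
            elem.append(to_xml(child_name, child_value))
    else:
        elem.text = str(value)
    return elem


flat = json.loads('{"field1": "xxxx", "field2": "yyyy", "field3": {"field4": "zzzz"}}')
nested = shift(flat, SHIFT_SPEC)  # unmapped keys (field2, field3) are dropped
root_name, root_value = next(iter(nested.items()))
xml_text = ET.tostring(to_xml(root_name, root_value), encoding="unicode")
print(xml_text)
```

The point of the sketch is that the new parent elements come from reshaping the data itself: once the record is nested as fielda -> fieldb -> field1, serializing it to XML is mechanical.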


How do I define a new structure in the AVRO schema I use in the output?


As fielda and fieldb do not exist in the input, if I define them in the Avro schema used for the output, they come out as blank structures.


Eg:


<fielda></fielda>
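
For reference, in Avro a nested structure is declared as a field whose type is itself a record. The sketch below is an illustration only: the record names `root`, `fieldaType` and `fieldbType` are placeholders, and `missing_fields` is a hypothetical helper, not part of NiFi or any Avro library. It prints such a schema as JSON and checks which schema fields a record actually supplies, which is one plausible reading of the blank output: the flat input has no fielda key to fill the declared structure.

```python
import json

# Hypothetical Avro schema; the record names are placeholders.
NESTED_SCHEMA = {
    "type": "record",
    "name": "root",
    "fields": [{
        "name": "fielda",
        "type": {
            "type": "record",
            "name": "fieldaType",
            "fields": [{
                "name": "fieldb",
                "type": {
                    "type": "record",
                    "name": "fieldbType",
                    "fields": [{"name": "field1", "type": "string"}],
                },
            }],
        },
    }],
}


def missing_fields(record, schema, path=""):
    """List schema fields (as dotted paths) that the record does not supply."""
    missing = []
    for field in schema["fields"]:
        name, ftype = field["name"], field["type"]
        full = f"{path}{name}"
        if not isinstance(record, dict) or name not in record:
            missing.append(full)
        elif isinstance(ftype, dict) and ftype.get("type") == "record":
            # Descend into nested record types.
            missing.extend(missing_fields(record[name], ftype, full + "."))
    return missing


flat = {"field1": "xxxx"}
nested = {"fielda": {"fieldb": {"field1": "xxxx"}}}

print(json.dumps(NESTED_SCHEMA, indent=2))
print(missing_fields(flat, NESTED_SCHEMA))    # the flat record supplies nothing for fielda
print(missing_fields(nested, NESTED_SCHEMA))  # the reshaped record supplies every field
```

Under this reading, the output schema only describes the nesting; the data would still need to be reshaped into that nesting before the writer runs.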


There seem to be a number of ways of converting JSON -> XML, so:


  1. Does the flow look generally correct?

  2. If correct, how do I define new nested structures in the XML output using Avro?


Thanks


Marcus