Created on 11-08-2016 05:24 PM - edited 08-17-2019 08:24 AM
Introduction
I recently worked on a use case that required heavy XML processing. Instead of
writing complex custom code, I ended up achieving everything easily with NiFi.
I thought this would be useful for anyone interested in XML processing in NiFi.
This document covers the following:
Base64 encoding and decoding of an XML message.
Character set conversion from UTF-8 to ISO-8859-1.
XML validation against an XSD.
Splitting the XML into smaller chunks.
Transforming XML to JSON.
Extracting the content and writing it to unique files based on the content.
This is a generic XML processing flow that can be reused across many business
use cases that process XML data.
Apache NiFi Flow
In the sample demo scenario:
An external system sends Base64-encoded XML data as a file, which is read by
the GetFile processor.
Next, the Base64EncodeContent processor decodes the Base64 content.
The incoming data is UTF-8 with leading BOM bytes; the ConvertCharacterSet
processor converts it to ISO-8859-1.
The XML content is validated against the XML schema using the ValidateXml
processor.
The validated XML is split at the level of the root's children into smaller
XML chunks.
Each split XML chunk is converted into a JSON object using XSLT and then
written to an individual file.
Every file is named after a unique identifier taken from the flow content.
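The last three steps (split at the root's children, convert each chunk to JSON, name each output by a unique identifier) can be illustrated outside NiFi with a small Python sketch. This is not how the flow itself does it: the flow uses XSLT for the conversion, while the sketch builds the JSON directly in Python. The sample XML, the `id` attribute used as the unique identifier, and the helper names `split_children` and `record_to_json` are all hypothetical.

```python
import json
import xml.etree.ElementTree as ET


def split_children(xml_bytes: bytes) -> list:
    """Return the direct children of the root element, one per chunk."""
    root = ET.fromstring(xml_bytes)
    return list(root)


def record_to_json(elem: ET.Element) -> str:
    """Flatten one chunk: its attributes plus the text of each direct child.

    Assumes flat records; nested elements would need a recursive version.
    """
    record = dict(elem.attrib)
    for child in elem:
        record[child.tag] = (child.text or "").strip()
    return json.dumps(record, sort_keys=True)


# Hypothetical input: two records under a single root element.
sample = (
    b'<claims>'
    b'<claim id="C1"><amount>100</amount></claim>'
    b'<claim id="C2"><amount>250</amount></claim>'
    b'</claims>'
)

# Map each output file name (derived from the hypothetical 'id' attribute)
# to the JSON text that would be written to it.
outputs = {}
for chunk in split_children(sample):
    filename = f"{chunk.get('id')}.json"
    outputs[filename] = record_to_json(chunk)
```

Here `outputs` holds one JSON document per child of the root, keyed by the file name that would be used when writing it out.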
Processor Configurations
Base64EncodeContent
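For this flow the processor's Mode property would be set to Decode. As a rough illustration of what that does to the file content, here is a minimal Python sketch; it is not NiFi's implementation, and the sample XML and the function name `decode_base64_content` are made up for the example.

```python
import base64


def decode_base64_content(encoded: bytes) -> bytes:
    """Decode Base64 file content back to the raw XML bytes.

    b64decode ignores line breaks by default, which encoders often insert.
    """
    return base64.b64decode(encoded)


# Hypothetical payload standing in for the file sent by the external system.
sample_xml = b'<claims><claim id="C1"/></claims>'
encoded = base64.encodebytes(sample_xml)

decoded = decode_base64_content(encoded)
```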
ConvertCharacterSet
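Here the Input Character Set would be UTF-8 and the Output Character Set ISO-8859-1. The Python sketch below shows the equivalent byte-level conversion as an illustration only. It drops the leading BOM explicitly through Python's `utf-8-sig` codec, which is an assumption about the desired result rather than a statement about how ConvertCharacterSet treats the BOM. The function name `utf8_to_latin1` and the sample data are hypothetical.

```python
def utf8_to_latin1(data: bytes) -> bytes:
    """Re-encode UTF-8 bytes (with or without a BOM) as ISO-8859-1."""
    # "utf-8-sig" decodes UTF-8 and strips a leading BOM if one is present.
    text = data.decode("utf-8-sig")
    # Raises UnicodeEncodeError if a character has no ISO-8859-1 equivalent.
    return text.encode("iso-8859-1")


# Hypothetical input: UTF-8 BOM followed by text containing a non-ASCII letter.
utf8_with_bom = b"\xef\xbb\xbf" + "<name>José</name>".encode("utf-8")

latin1 = utf8_to_latin1(utf8_with_bom)
```

In UTF-8 the letter é takes two bytes (0xC3 0xA9), while in ISO-8859-1 it is the single byte 0xE9, so the converted content is shorter than the input even after the BOM is removed.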
ValidateXml:
Value: /Users/mpandit/jdeveloper/mywork/ClaimProcess/ClaimProcess/Initiate_App.xsd