Objective

About a year ago, I wrote an article that detailed how to use the ConvertRecord processor and Record Reader/Writer controller services to easily convert a CSV file into various formats (JSON, Avro, XML):

https://community.hortonworks.com/content/kbentry/115311/convert-csv-to-json-avro-xml-using-convertr....

At the time, the CSV to XML conversion was done using a ScriptedRecordSetWriter. With the release of NiFi 1.7.0, the CSV to XML conversion can be done much more simply with the new XMLRecordSetWriter.

Environment

This tutorial was tested using the following environment and components:

  • Mac OS X 10.11.6
  • Apache NiFi 1.7.0

Convert CSV to XML

Support Files

Here is a template of the flow discussed in this tutorial:

convert-cvs-to-xml.xml

Here is the CSV file used in the flow:

users.txt

Note: Change the extension from .txt to .csv after downloading.

Demo Configuration

Import Template

Start NiFi. Import the provided template and add it to the canvas.

You should see the following flow:

78513-1-import-template.png

Note: After importing the template, make sure the directories configured in the GetFile and PutFile processors exist (create them if necessary), confirm that users.csv is in the input directory, and enable all Controller Services before running the flow:

78514-2-enabled-controller-services.png
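
If you prefer to script the directory check from the note above, here is a minimal sketch. The function name prepare_flow_dirs and the directory arguments are hypothetical; substitute the paths that your GetFile and PutFile processors are actually configured with.

```python
from pathlib import Path


def prepare_flow_dirs(input_dir, output_dir, csv_name="users.csv"):
    """Create the input and output directories if they are missing.

    Returns True when csv_name is already present in the input directory.
    The directory arguments are placeholders for the GetFile/PutFile paths.
    """
    in_path = Path(input_dir)
    out_path = Path(output_dir)
    # parents=True creates intermediate directories; exist_ok=True makes
    # the call safe to repeat when the directories already exist.
    in_path.mkdir(parents=True, exist_ok=True)
    out_path.mkdir(parents=True, exist_ok=True)
    return (in_path / csv_name).is_file()
```

A False return value means the directories now exist but users.csv still needs to be copied into the input directory.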

Flow Highlights

Details of the original flow are covered in my previous HCC article, but here are the key changes made:

ConvertRecord - CSVtoXML (ConvertRecord Processor)

Record Reader is still set to "CSVReader" but Record Writer is now set to the new "XMLRecordSetWriter":

78515-3-convertrecord-processor.png

XMLRecordSetWriter Controller Service

Here are the properties for this controller service:

78516-4-xmlrecordsetwriter-properties.png

Apart from the default values, the Schema Access Strategy property is set to "Use 'Schema Name' Property", Schema Registry is set to AvroSchemaRegistry, and Name of Root Tag is set to "record".
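
To make the shape of the conversion concrete, here is a rough Python sketch of what a CSV to XML record conversion does conceptually. It is not NiFi code and does not reproduce the XMLRecordSetWriter implementation. The sample columns and the tag names "users" and "user" are hypothetical placeholders, not the values used in the flow.

```python
import csv
import io
import xml.etree.ElementTree as ET


def csv_to_xml(csv_text, root_tag, record_tag):
    """Convert CSV text (with a header row) into an XML string.

    Each data row becomes one <record_tag> element under <root_tag>,
    and each column becomes a child element named after its header.
    Assumes every header is a valid XML element name.
    """
    root = ET.Element(root_tag)
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = ET.SubElement(root, record_tag)
        for column, value in row.items():
            ET.SubElement(record, column).text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample data; the real users.csv has its own columns.
sample = "id,first,last\n1,Ada,Lovelace\n2,Alan,Turing\n"
print(csv_to_xml(sample, root_tag="users", record_tag="user"))
```

In the flow itself this work is split between the CSVReader, which parses rows according to the schema, and the XMLRecordSetWriter, which serializes them using the configured tag names.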

Flow Results

Running this updated flow now produces a flowfile with XML contents:

78517-5-xml-flowfile-contents.png
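
If you want to sanity-check the file that PutFile writes, a small script like the following can confirm that it is well-formed XML and report how many records it contains. This is only a sketch: summarize_xml is a hypothetical helper, and the path you pass in depends on your PutFile configuration.

```python
import xml.etree.ElementTree as ET


def summarize_xml(path):
    """Parse an XML file and return (root tag, record count, first record's field names).

    ET.parse raises xml.etree.ElementTree.ParseError if the file is not well-formed.
    """
    root = ET.parse(path).getroot()
    records = list(root)
    first_fields = [child.tag for child in records[0]] if records else []
    return root.tag, len(records), first_fields
```

The record count should match the number of data rows in users.csv (excluding the header row).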

Version history: revision 2 of 2, last updated 08-17-2019 07:07 AM