HDF - Flatten a xml file

Is it possible to flatten an XML file using NiFi? For example: Link

Accepted Solutions

Re: HDF - Flatten a xml file

Yes, check out these processors:

With the approaches described in your link above and these processors, this should be straightforward to achieve.

Re: HDF - Flatten a xml file

Hi @Andrew Grande and @Neeraj Sabharwal

1) I was able to use the TransformXml processor to convert the XML into the following format:

<?xml version="1.0" encoding="UTF-8"?>
<ROWSET>
  <ROW kind="element" name="Product" pid="" id="d1e1"> </ROW>
  <ROW kind="attribute" name="Type" pid="d1e1" id="d1e1a1373">Laptop</ROW>
  <ROW kind="element" name="Notebook" pid="d1e1" id="d1e3"> </ROW>
  <ROW kind="attribute" name="Brand" pid="d1e3" id="d1e3a1403">HP</ROW>
  <ROW kind="attribute" name="Model" pid="d1e3" id="d1e3a1938">Pavilion dv6-3132TX Notebook</ROW>
  <ROW kind="element" name="Harddisk" pid="d1e3" id="d1e5">640 GB</ROW>
  <ROW kind="element" name="Processor" pid="d1e3" id="d1e8">Intel Core i7</ROW>
  <ROW kind="element" name="RAM" pid="d1e3" id="d1e11">4 GB</ROW>
  <ROW kind="element" name="Notebook" pid="d1e1" id="d1e15"> </ROW>
  <ROW kind="attribute" name="Brand" pid="d1e15" id="d1e15a1403">HP</ROW>
  <ROW kind="attribute" name="Model" pid="d1e15" id="d1e15a1938">HP Pavilion dv6-3032TX Notebook</ROW>
  <ROW kind="element" name="Harddisk" pid="d1e15" id="d1e17">640 GB</ROW>
  <ROW kind="element" name="Processor" pid="d1e15" id="d1e20">Intel Core i7</ROW>
  <ROW kind="element" name="RAM" pid="d1e15" id="d1e23">6 GB</ROW>
  <ROW kind="element" name="Notebook" pid="d1e1" id="d1e27"> </ROW>
  <ROW kind="attribute" name="Brand" pid="d1e27" id="d1e27a1403">Toshiba</ROW>
  <ROW kind="attribute" name="Model" pid="d1e27" id="d1e27a1938">Satellite A660/07R 3D Notebook</ROW>
  <ROW kind="element" name="Harddisk" pid="d1e27" id="d1e29">640 GB</ROW>
  <ROW kind="element" name="Processor" pid="d1e27" id="d1e32">Intel Core i7</ROW>
  <ROW kind="element" name="RAM" pid="d1e27" id="d1e35">4 GB</ROW>
  <ROW kind="element" name="Notebook" pid="d1e1" id="d1e39"> </ROW>
  <ROW kind="attribute" name="Brand" pid="d1e39" id="d1e39a1403">Toshiba</ROW>
  <ROW kind="attribute" name="Model" pid="d1e39" id="d1e39a1938">Satellite A660/15J Notebook</ROW>
  <ROW kind="element" name="Harddisk" pid="d1e39" id="d1e41">640 GB</ROW>
  <ROW kind="element" name="Processor" pid="d1e39" id="d1e44">Intel Core i5</ROW>
  <ROW kind="element" name="RAM" pid="d1e39" id="d1e47">6 GB</ROW>
</ROWSET>

2) My question is how to convert the following query from the sample into an XQuery for NiFi's EvaluateXQuery processor (the example uses Oracle):

SELECT x.*
FROM xml_test t,
     XMLTable('/ROWSET/ROW'
       passing xmltransform(t.object_value, xmltype(:xsldoc))
       columns node_id        varchar2(100)  path '@id',
               node_name      varchar2(30)   path '@name',
               node_value     varchar2(2000) path 'text()',
               parent_node_id varchar2(100)  path '@pid',
               node_kind      varchar2(30)   path '@kind'
     ) x;
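
For reference, the projection that the XMLTable call performs (one output record per /ROWSET/ROW, with the five columns read from @id, @name, text(), @pid and @kind) can be sketched outside NiFi in a few lines of Python. This is only an illustration of the expected result shape, not a NiFi configuration: the function name flatten_rows and the shortened SAMPLE document are made up for the example, and treating empty or whitespace-only values as missing is an assumption about how the nulls should come out.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the ROWSET output above (first four rows only).
SAMPLE = """<ROWSET>
  <ROW kind="element" name="Product" pid="" id="d1e1"> </ROW>
  <ROW kind="attribute" name="Type" pid="d1e1" id="d1e1a1373">Laptop</ROW>
  <ROW kind="element" name="Notebook" pid="d1e1" id="d1e3"> </ROW>
  <ROW kind="attribute" name="Brand" pid="d1e3" id="d1e3a1403">HP</ROW>
</ROWSET>"""


def flatten_rows(xml_text):
    """Return one dict per /ROWSET/ROW, mirroring the XMLTable columns."""
    root = ET.fromstring(xml_text)  # root element is ROWSET
    rows = []
    for row in root.findall("ROW"):  # same node set as '/ROWSET/ROW'
        rows.append({
            "node_id": row.get("id"),                        # path '@id'
            "node_name": row.get("name"),                    # path '@name'
            # path 'text()'; assumption: whitespace-only text counts as missing
            "node_value": (row.text or "").strip() or None,
            # path '@pid'; assumption: an empty pid means "no parent"
            "parent_node_id": row.get("pid") or None,
            "node_kind": row.get("kind"),                    # path '@kind'
        })
    return rows


if __name__ == "__main__":
    for record in flatten_rows(SAMPLE):
        print(record)
```

Whatever XQuery ends up in EvaluateXQuery would need to produce the same five values for each ROW element.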