
Complex XML To Hive table using NiFi


New Contributor

Hi Guys,

Sorry, I'm new to NiFi and I'm having a problem right now. I have already built a data flow that inserts my XML data into a Hive table.

But I ran into a problem when I was given another set of XML. This is only a sample of my XML format:

<customer>
  <group>
    <site>
      <userline></userline>
      <userline></userline>
      <userline></userline>
    </site>
  </group>
  <group>
    <site>
      <userline></userline>
      <userline></userline>
      <userline></userline>
    </site>
  </group>
</customer>

I have tables for Customer, Group, and Userline. The Userline table should be linked to the Group table via GroupID, and the Group table should be linked to the Customer table via CustomerID.

I can already save each customer.

The problem is how to save the multiple groups to the Group table and the multiple userlines to the Userline table for each customer, with each row linked to its parent.

My current data flow only picks up the FIRST group and the FIRST userline.

Is there any kind of for loop I can use in NiFi?

And how can I prevent errors, given that some customers don't have any groups or userlines?

I hope my question is clear.

I attached the sample XML and my current flow.

10080-currentflow.png

Thank you.

2 REPLIES

Re: Complex XML To Hive table using NiFi

I know this is an old question, but I'll answer it in case someone else comes across it in future: I'd suggest you convert this XML into JSON, split each record type (customer, group, userline) out into separate documents using Jolt and JsonPath, write them into separate tables, and then do your join logic in Hive afterwards. I've just posted a guide on shredding XML which might be useful: https://community.hortonworks.com/articles/105547/nifi-xml-to-json-shredding-a-generalised-solution-...
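To make the "split into separate tables" idea concrete, here is a minimal sketch in plain Python rather than NiFi, Jolt, or JsonPath, so it is not the flow from the linked guide. It assumes a hypothetical `<customers>` wrapper element and generates surrogate IDs by simple counting; the function name `shred` and the column names are made up for illustration. It walks every group and every userline (not just the first), and a customer with no groups simply produces no child rows instead of an error.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the <customers> wrapper and the userline text
# values are assumptions added for illustration.
SAMPLE = """
<customers>
  <customer>
    <group>
      <site>
        <userline>a</userline>
        <userline>b</userline>
      </site>
    </group>
    <group>
      <site>
        <userline>c</userline>
      </site>
    </group>
  </customer>
  <customer/>
</customers>
"""


def shred(xml_text):
    """Split nested XML into three flat row lists linked by generated IDs."""
    root = ET.fromstring(xml_text)
    customers, groups, userlines = [], [], []
    # iter() also matches the root itself, so a lone <customer> root works too.
    for customer in root.iter("customer"):
        customer_id = len(customers) + 1
        customers.append({"customer_id": customer_id})
        # findall() returns an empty list when there are no groups,
        # so customers without groups are handled without special cases.
        for group in customer.findall("group"):
            group_id = len(groups) + 1
            groups.append({"group_id": group_id, "customer_id": customer_id})
            for userline in group.findall("site/userline"):
                userlines.append({
                    "userline_id": len(userlines) + 1,
                    "group_id": group_id,
                    "value": (userline.text or "").strip(),
                })
    return customers, groups, userlines


customers, groups, userlines = shred(SAMPLE)
```

Each of the three lists could then be loaded into its own Hive table, with the joins on `customer_id` and `group_id` done in Hive afterwards.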

Re: Complex XML To Hive table using NiFi

New Contributor

Hi

An XML tree is complex because it is hierarchical, and you most likely want a flat structure for easier access to the data.

http://max.bback.se/index.php/2018/06/30/xml-to-tables-csv-with-nifi-and-groovy-part-2-of-2/

I just wrapped up the second article in this series yesterday, and the code is available via the GitHub link included in the article.

The article series describes the problem and provides an implementation that converts XML to CSV by flattening the XML files. My example XML is flattened into 4 tables; the number of tables depends on how many one-to-many branches you have.
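As a rough illustration of how one-to-many branches determine the table count, the sketch below (Python, not the Groovy code from the article) lists every element path that repeats under a single parent. The function name `one_to_many_paths` and the sample document are assumptions for illustration; for the XML shape in the original question it finds two such branches, which together with the root gives the three tables (customer, group, userline) the poster described.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def one_to_many_paths(root):
    """Return the tag paths that occur more than once under one parent."""
    repeated = set()

    def walk(elem, path):
        # Count how often each child tag appears under this parent.
        counts = Counter(child.tag for child in elem)
        for child in elem:
            child_path = path + "/" + child.tag
            if counts[child.tag] > 1:
                repeated.add(child_path)
            walk(child, child_path)

    walk(root, root.tag)
    return repeated


# Hypothetical sample mirroring the structure from the question.
doc = ET.fromstring(
    "<customer>"
    "<group><site><userline/><userline/></site></group>"
    "<group><site><userline/></site></group>"
    "</customer>"
)
branches = one_to_many_paths(doc)
```

Note that this only detects repetition actually present in the given document; a branch that happens to occur once in a sample file would need to be declared as one-to-many by other means, such as a schema.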

/Max