
What are the issues with using XML serde for reading XML data, other than being a little slow for huge datasets?

 
3 REPLIES 3

Re: What are the issues with using XML serde for reading XML data - other than being a little slow for huge datasets

One issue arises when you load each XML document into Hive as a single column and query it with XPath: extracting data that is nested very deep within the XML becomes difficult.
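
For illustration, the single-column approach might look like the sketch below. It uses Hive's built-in XPath UDFs (`xpath_string`, `xpath_double`, `xpath`). The table `xml_raw`, its column `doc`, and the `/order/...` document layout are hypothetical, and the sketch assumes each XML document is stored on one line.

```sql
-- Hypothetical table: one whole XML document per row, in a single STRING column.
CREATE TABLE xml_raw (doc STRING);

-- Every field is extracted at query time with an XPath expression.
-- The deeper the nesting, the longer and more fragile these paths become.
SELECT
  xpath_string(doc, '/order/customer/address/city') AS city,   -- first match as a string
  xpath_double(doc, '/order/total')                 AS total,  -- numeric value
  xpath(doc, '/order/items/item/@sku')              AS skus    -- array<string> of all matches
FROM xml_raw;
```

Each query has to repeat the paths, and the document is re-parsed for every UDF call, which is part of why this style gets awkward for deeply nested data.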

Re: What are the issues with using XML serde for reading XML data - other than being a little slow for huge datasets

We are not loading all the data into one column. We will map each XML element to its own column using column.xpath.

Do you see any issues with that approach?
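
As a rough sketch of the column.xpath approach described here: the table definition below maps one XPath expression to each column through SerDe properties. The class names and the `xmlinput.start` / `xmlinput.end` properties are those I recall from the Hive-XML-SerDe project's README and should be checked against the version you deploy. The jar path, table name, HDFS location, and `/order/...` document layout are hypothetical.

```sql
-- Hypothetical jar path; point this at wherever the SerDe jar is deployed.
ADD JAR /path/to/hivexmlserde.jar;

CREATE EXTERNAL TABLE orders_xml (
  order_id STRING,
  city     STRING,
  total    DOUBLE
)
ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES (
  -- One "column.xpath.<column name>" entry per column.
  "column.xpath.order_id" = "/order/@id",
  "column.xpath.city"     = "/order/customer/address/city/text()",
  "column.xpath.total"    = "/order/total/text()"
)
STORED AS
  INPUTFORMAT  'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION '/data/orders_xml'   -- hypothetical HDFS directory
TBLPROPERTIES (
  -- Markers the input format uses to split the file into one record per <order> element.
  "xmlinput.start" = "<order ",
  "xmlinput.end"   = "</order>"
);
```

With a definition along these lines, queries can select `order_id`, `city`, and `total` as ordinary columns, and the XPath expressions live in one place instead of in every query.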

Re: What are the issues with using XML serde for reading XML data - other than being a little slow for huge datasets

@Narayana Ganta

I have had very good experiences with the XML SerDe, both for deep, nested structures and for large data sizes.

Regarding performance, note that the VTD-based version (GPLv2 license, link below) can speed things up:

https://github.com/dvasilen/Hive-XML-SerDe-VTD