
What is the best way to load XML data into Hive?

New Contributor
 
2 REPLIES

Rising Star

@priy shankar

The easiest way is to use the Hive XML SerDe (com.ibm.spss.hive.serde2.xml.XmlSerDe), which lets you define a Hive table directly over XML files and query the data without converting it first.

Please see the following links for the steps to get this working:

https://community.hortonworks.com/content/kbentry/972/hive-and-xml-pasring.html

https://community.hortonworks.com/questions/40979/hive-xml-parising-null-value-returned.html
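
As a rough sketch of what the XmlSerDe approach can look like: the table name, column names, XPath expressions, JAR path, and HDFS location below are all hypothetical and assume records shaped like <book id="1"><title>...</title></book>. Adjust them to your own XML and check the linked posts for the exact steps.

```sql
-- Register the XML SerDe JAR for this session (hypothetical path; use wherever your JAR lives).
ADD JAR /path/to/hivexmlserde.jar;

-- Assumed input: files under /data/books containing records like
--   <book id="1"><title>Some title</title></book>
CREATE EXTERNAL TABLE books_xml (
  id    STRING,
  title STRING
)
ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES (
  -- One XPath per column, evaluated against each extracted record.
  "column.xpath.id"    = "/book/@id",
  "column.xpath.title" = "/book/title/text()"
)
STORED AS
  INPUTFORMAT  'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION '/data/books'
TBLPROPERTIES (
  -- Byte sequences that mark where one record starts and ends.
  -- The trailing space in "<book " assumes the tag always carries an attribute.
  "xmlinput.start" = "<book ",
  "xmlinput.end"   = "</book>"
);

-- Query it like any other table.
SELECT id, title FROM books_xml LIMIT 10;
```

If a column comes back NULL, the usual suspects are an XPath that does not match the record or start/end markers that do not match the tags exactly.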

New Contributor

You can also automate the whole process of converting the XML into a relational structure and generating ORC/Parquet files for Hive. This blog post shows how to convert MISMO XML to Hive and Parquet.