What is the best way to load XML data into Hive?

Re: what is the best way to load xml data into hive

Rising Star

@priy shankar

The easiest way is to use the Hive XML SerDe (com.ibm.spss.hive.serde2.xml.XmlSerDe), which lets you define a Hive table directly over your XML files and query the data without converting it first.

Please see the following links for the steps to get this working:

https://community.hortonworks.com/content/kbentry/972/hive-and-xml-pasring.html

https://community.hortonworks.com/questions/40979/hive-xml-parising-null-value-returned.html
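As a rough illustration of what those steps usually end up looking like, here is a minimal HiveQL sketch. The jar path, table name, column names, XPath expressions, record tags, and HDFS location are all hypothetical placeholders rather than values from this thread, so adjust them to your own files and SerDe version.

```sql
-- Hypothetical jar path: point this at the Hive XML SerDe jar you actually downloaded.
ADD JAR /tmp/hivexmlserde.jar;

-- Assumes XML files under /data/books containing records such as:
--   <book id="1"><title>Some title</title></book>
CREATE EXTERNAL TABLE books_xml (
  id    STRING,
  title STRING
)
ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES (
  -- One XPath expression per column, evaluated against each record.
  "column.xpath.id"    = "/book/@id",
  "column.xpath.title" = "/book/title/text()"
)
STORED AS
  INPUTFORMAT  'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
LOCATION '/data/books'
TBLPROPERTIES (
  -- Markers that tell the input format where each XML record starts and ends.
  "xmlinput.start" = "<book ",
  "xmlinput.end"   = "</book>"
);

-- Query the XML-backed table like any other Hive table.
SELECT id, title FROM books_xml LIMIT 10;
```

If a column comes back as NULL, the usual suspects are an XPath expression that does not match the record, or start/end markers that do not match the tags as they literally appear in the file.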

Re: what is the best way to load xml data into hive

New Contributor

You can also automate the whole process of converting XML into relational ORC or Parquet tables for Hive. This blog post shows how to convert MISMO XML to Hive and Parquet.
