Support Questions

Find answers, ask questions, and share your expertise

How to create hive ddl on top xml using the xml serde

avatar
Contributor

Hi I have an xml with below format,Can anyone help me how to create a hive DDL on top of this xml.

<code><root><root1><id>4545482361</id>`enter code here`
<joiningdate>1/3/2010</joiningdate><Segments><Segmentxse:type="manager"><cityworked>Hyd</cityworked><reports>john</reports><salary>150000</salary><datestarted>1/3/2012</datestarted></Segment><Segmentxse:type="manager"><cityworked>Hyd</cityworked><reports>mike</reports><salary>225000</salary><datestarted>1/9/2014</datestarted></Segment><Segmentxse:type="VP"><cityworked>mumbai</cityworked><datestarted>1/9/2014</datestarted><subemployees><Fname>ram</Fname><Lname>Achanta</Lname><Desgination>Director of IT</Desgination></subemployees></Segment><Segmentxse:type="SVP"><Staus>currentposition</status><numberofemployees>10</numberofemployees></Segment></Segments></root1></root>
2 REPLIES 2

avatar
Expert Contributor
@srini

You can import xml directly using com.ibm.spss.hive.serde2.xml.XmlSerDe which is detailed here:

https://community.hortonworks.com/articles/972/hive-and-xml-pasring.html

Other option is to load entire record in a string and then access it using xpath UDF :

1. create table employee( employee_info string) ;

2. load data local inpath '/home/hduser/sample.xml' into table employee;

3. create view employee_xml_view as SELECT xpath_int(employee_info ,'code/root/root1/id'),xpath_string(employee_info ,'code/root/root1/joiningdate'),xpath_string(employee_info ,'code/root/root1/joiningdate').................. from employee;

4. Select * from employee_xml_view;

avatar
Contributor

Hi Trivedi ,Thanks for your reply.Can you please answer this in details because i was able to create ddl on other part of xml which consists of array and structs,but if you see the xml i attached segments is a tag which contains segment within it and xse:type is manager,vp,svp. Can you please help in creating the exact schema for this