Created 03-12-2018 03:04 PM
Hi I have an xml with below format,Can anyone help me how to create a hive DDL on top of this xml.
<code><root><root1><id>4545482361</id>`enter code here` <joiningdate>1/3/2010</joiningdate><Segments><Segmentxse:type="manager"><cityworked>Hyd</cityworked><reports>john</reports><salary>150000</salary><datestarted>1/3/2012</datestarted></Segment><Segmentxse:type="manager"><cityworked>Hyd</cityworked><reports>mike</reports><salary>225000</salary><datestarted>1/9/2014</datestarted></Segment><Segmentxse:type="VP"><cityworked>mumbai</cityworked><datestarted>1/9/2014</datestarted><subemployees><Fname>ram</Fname><Lname>Achanta</Lname><Desgination>Director of IT</Desgination></subemployees></Segment><Segmentxse:type="SVP"><Staus>currentposition</status><numberofemployees>10</numberofemployees></Segment></Segments></root1></root>
Created 03-12-2018 04:35 PM
You can import xml directly using com.ibm.spss.hive.serde2.xml.XmlSerDe which is detailed here:
https://community.hortonworks.com/articles/972/hive-and-xml-pasring.html
Other option is to load entire record in a string and then access it using xpath UDF :
1. create table employee( employee_info string) ;
2. load data local inpath '/home/hduser/sample.xml' into table employee;
3. create view employee_xml_view as SELECT xpath_int(employee_info ,'code/root/root1/id'),xpath_string(employee_info ,'code/root/root1/joiningdate'),xpath_string(employee_info ,'code/root/root1/joiningdate').................. from employee;
4. Select * from employee_xml_view;
Created 03-12-2018 05:50 PM
Hi Trivedi ,Thanks for your reply.Can you please answer this in details because i was able to create ddl on other part of xml which consists of array and structs,but if you see the xml i attached segments is a tag which contains segment within it and xse:type is manager,vp,svp. Can you please help in creating the exact schema for this