Member since
03-29-2020
1
Post
0
Kudos Received
0
Solutions
03-29-2020
03:13 AM
For below sample xml file: <Company> <Employee> <Id>458790</Id> <Name>Sameer</Name> <Email>sameer@email.com</Email> <Address> <HouseNo>105</HouseNo> <Street>Grand Road</Street> <City>Bangalore</City> <State>Karnataka</State> <Pincode>560068</Pincode> <Country>India</Country> <Passport>Available</Passport> <Visa/> <Contact> <Mobile>9909999999</Mobile> <Phone>8044552266</Phone> </Contact> </Address> </Employee> <Employee> <Id>458791</Id> <Name>Gohar</Name> <Email>Gohar@email.com</Email> <Address> <HouseNo>485</HouseNo> <Street>Camac Street Road</Street> <City>Mumbai</City> <State>Maharastra</State> <Pincode>400001</Pincode> <Country>India</Country> <Passport>Available</Passport> <Visa/> <Contact> <Mobile>9908888888</Mobile> <Phone/> </Contact> </Address> </Employee> </Company> Executed this query for a table containing three columns id name email Getting all employees information from xml into different rows like: 458790 Sameer sameer@email.com 458791 Gohar Gohar@email.com by executing below query CREATE EXTERNAL TABLE Company (Id STRING, Name String,Email String) ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe' WITH SERDEPROPERTIES ( "column.xpath.Id"="Employee/Id/text()", "column.xpath.Name"="Employee/Name/text()", "column.xpath.Email"="Employee/Email/text()") STORED AS INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat' OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat' LOCATION '/user/add/hive/xml' TBLPROPERTIES ("xmlinput.start"="<Company","xmlinput.end"= "</Company>") but when I am storing it as irc or textfile it is giving null values like stored as textfile then NULL NULL NULL how to resolve it, Thanks
... View more
Labels:
- Labels:
-
Apache Hive
-
HDFS