Support Questions
Find answers, ask questions, and share your expertise
Announcements
Alert: Welcome to the Unified Cloudera Community. Former HCC members be sure to read and learn how to activate your account here.

how to export xml data saved in a hive table's string field to other hive table created as SERDE ,how to export xml data saved in a ms-sql field (as single string) into hive table created with SERDE

Solved Go to solution

how to export xml data saved in a hive table's string field to other hive table created as SERDE ,how to export xml data saved in a ms-sql field (as single string) into hive table created with SERDE

New Contributor

1) below hive table "books_xml" contains only one field named "xmldata" in which I have saved xml data as a single string.

hive> select xmldata from books_xml;
OK
<?xml version="1.0" encoding="UTF-8"?><catalog><book>  <id>11</id> 
<genre>Computer</genre>  <price>44</price></book><book>  <id>44</id> 
<genre>Fantasy</genre>  <price>5</price></book></catalog>
Time taken: 0.175 seconds, Fetched: 1 row(s)
2) I want to export above xml data (which is in field) into below hive table created with SERDE properties?

CREATE TABLE books_serde (
   id STRING, 
   genre STRING, 
   price DOUBLE
)
ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES (
"column.xpath.id"="/book/id/text()",
"column.xpath.genre"="/book/genre/text()",
"column.xpath.price"="/book/price/text()"
)
STORED AS
INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
TBLPROPERTIES (
"xmlinput.start"="<book>",
"xmlinput.end"="</book>"
);
,
1) below hive table "books_xml" contains only one field named "xmldata String" in which I have saved xml data as a single string.
hive> select xmldata from books_xml;
OK
<?xml version="1.0" encoding="UTF-8"?><catalog><book>  <id>11</id> 
<genre>Computer</genre>  <price>44</price></book><book>  <id>44</id> 
<genre>Fantasy</genre>  <price>5</price></book></catalog>
Time taken: 0.175 seconds, Fetched: 1 row(s)
2) I want to export above xml data (which is in field) into below hive table created with SERDE properties?
CREATE TABLE books_serde (
   id STRING, 
   genre STRING, 
   price DOUBLE
)
ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
WITH SERDEPROPERTIES (
"column.xpath.id"="/book/id/text()",
"column.xpath.genre"="/book/genre/text()",
"column.xpath.price"="/book/price/text()"
)
STORED AS
INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
TBLPROPERTIES (
"xmlinput.start"="<book>",
"xmlinput.end"="</book>"
);
1 ACCEPTED SOLUTION

Accepted Solutions

Re: how to export xml data saved in a hive table's string field to other hive table created as SERDE ,how to export xml data saved in a ms-sql field (as single string) into hive table created with SERDE

Rising Star

@Riaz Lala

Okay, I have got two options for you that you can try:

Option#11) Using xpath builtin function:

You might have to do some data transformations to your input file to break down the book nodes to individual records. My input data looks like below:

<catalog><book><id>11</id><genre>Computer</genre><price>44</price></book></catalog> 

<catalog><book><id>45</id><genre>Fantasy</genre><price>5</price></book></catalog>

hive> select * from books_xml; 

OK 

<catalog><book><id>11</id><genre>Computer</genre><price>44</price></book></catalog> 

<catalog><book><id>45</id><genre>Fantasy</genre><price>5</price></book></catalog>
Time taken: 0.483 seconds, Fetched: 2 row(s)

Then you can use xpath function like below to access individual columns:

hive> select xpath_int(xmldata, '/catalog/book/id'), xpath_string(xmldata, '/catalog/book/genre') as genre, xpath_int(xmldata, '/catalog/book/price') from books_xml;
OK
11	Computer	44
45	Fantasy		5
Time taken: 0.508 seconds, Fetched: 2 row(s)

You can do insert overwrite to load into any other external table you want with regular column names

Option#2) Use of external UDF function that parses the xml data into an array and use explode function to select the indivudual values and load them to any target table you have.

Let me know if this helps

6 REPLIES 6

Re: how to export xml data saved in a hive table's string field to other hive table created as SERDE ,how to export xml data saved in a ms-sql field (as single string) into hive table created with SERDE

Rising Star

@Riaz Lala

Have you looked at the post

Let me know if this is helpful

Re: how to export xml data saved in a hive table's string field to other hive table created as SERDE ,how to export xml data saved in a ms-sql field (as single string) into hive table created with SERDE

New Contributor

Thanks Hari, yes I have seen that post which is bit different in scenario. Post explains loading xml file "sample.xml" into hive serde table. In my case I have xml stored in a hive table field 'xmldata' and i want to use some sort of 'insert into table_serde select xmlfield from XmlTable'. I mean xml data moving from hive table to hive table.

Other way for me is that I can add location parameter in serde table pointing to location where single field hive table has stored its data but this has draw back if I add other fields in single field hive table it will invalidate serde table.

Thanks,

Riaz Lala

Re: how to export xml data saved in a hive table's string field to other hive table created as SERDE ,how to export xml data saved in a ms-sql field (as single string) into hive table created with SERDE

Rising Star

@Riaz Lala

Okay, I have got two options for you that you can try:

Option#11) Using xpath builtin function:

You might have to do some data transformations to your input file to break down the book nodes to individual records. My input data looks like below:

<catalog><book><id>11</id><genre>Computer</genre><price>44</price></book></catalog> 

<catalog><book><id>45</id><genre>Fantasy</genre><price>5</price></book></catalog>

hive> select * from books_xml; 

OK 

<catalog><book><id>11</id><genre>Computer</genre><price>44</price></book></catalog> 

<catalog><book><id>45</id><genre>Fantasy</genre><price>5</price></book></catalog>
Time taken: 0.483 seconds, Fetched: 2 row(s)

Then you can use xpath function like below to access individual columns:

hive> select xpath_int(xmldata, '/catalog/book/id'), xpath_string(xmldata, '/catalog/book/genre') as genre, xpath_int(xmldata, '/catalog/book/price') from books_xml;
OK
11	Computer	44
45	Fantasy		5
Time taken: 0.508 seconds, Fetched: 2 row(s)

You can do insert overwrite to load into any other external table you want with regular column names

Option#2) Use of external UDF function that parses the xml data into an array and use explode function to select the indivudual values and load them to any target table you have.

Let me know if this helps

Re: how to export xml data saved in a hive table's string field to other hive table created as SERDE ,how to export xml data saved in a ms-sql field (as single string) into hive table created with SERDE

New Contributor

Thank you very much Hari. Option#1 is perfect in my case.

Highlighted

Re: how to export xml data saved in a hive table's string field to other hive table created as SERDE ,how to export xml data saved in a ms-sql field (as single string) into hive table created with SERDE

Rising Star

@Riaz Lala

Glad that it was helpful. Can you upvote and accept the answer please?

Thanks

Re: how to export xml data saved in a hive table's string field to other hive table created as SERDE ,how to export xml data saved in a ms-sql field (as single string) into hive table created with SERDE

New Contributor

@Riaz Lala

Have you find solution to your post? If yes, Can you please pose your approach please. I am struggling with the same situation. Help needed. Thanks!

Don't have an account?
Coming from Hortonworks? Activate your account here