<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Hive XML Parsing - Null value returned in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-XML-Parising-Null-value-returned/m-p/149879#M32565</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/177/pminovic.html" nodeid="177"&gt;@Predrag Minovic&lt;/A&gt; Thanks for the answer, very impressive. I had assumed that the root node (CATALOG) had to be given in xmlinput.start and xmlinput.end so that all the nodes inside the root could be queried with XPath. Thanks for the clarification.&lt;/P&gt;</description>
    <pubDate>Tue, 28 Jun 2016 21:55:50 GMT</pubDate>
    <dc:creator>manikandan_rama</dc:creator>
    <dc:date>2016-06-28T21:55:50Z</dc:date>
    <item>
      <title>Hive XML Parsing - Null value returned</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-XML-Parising-Null-value-returned/m-p/149877#M32563</link>
      <description>&lt;P&gt;Hi All,&lt;/P&gt;&lt;P&gt;I tried out a sample XML parsing using the SerDe, but the query returns NULL values.&lt;/P&gt;&lt;PRE&gt;hive&amp;gt; DROP TABLE BOOKDATA;
OK
Time taken: 0.486 seconds
hive&amp;gt;
  &amp;gt; CREATE EXTERNAL TABLE BOOKDATA(
  &amp;gt; TITLE VARCHAR(40),
  &amp;gt; PRICE  INT
  &amp;gt; )ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
  &amp;gt; WITH SERDEPROPERTIES (
  &amp;gt; "column.xpath.TITLE"="/CATALOG/BOOK/TITLE/",
  &amp;gt; "column.xpath.PRICE"="/CATALOG/BOOK/PRICE/")
  &amp;gt; STORED AS
  &amp;gt; INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
  &amp;gt; OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
  &amp;gt; LOCATION '/sourcedata'
  &amp;gt; TBLPROPERTIES (
  &amp;gt; "xmlinput.start"="&amp;lt;CATALOG",
  &amp;gt; "xmlinput.end"= "&amp;lt;/CATALOG&amp;gt;"
  &amp;gt; );
OK
Time taken: 0.285 seconds&lt;/PRE&gt;
&lt;PRE&gt;hive&amp;gt; select * from BOOKDATA;
OK
NULL  NULL
Time taken: 0.184 seconds, Fetched: 1 row(s)
hive&amp;gt;&lt;/PRE&gt;
&lt;PRE&gt;~]$ hadoop fs -cat /sourcedata/bookdata.xml
&amp;lt;CATALOG&amp;gt;
&amp;lt;BOOK&amp;gt;
&amp;lt;TITLE&amp;gt;Hadoop Defnitive Guide&amp;lt;/TITLE&amp;gt;
&amp;lt;AUTHOR&amp;gt;Tom White&amp;lt;/AUTHOR&amp;gt;
&amp;lt;COUNTRY&amp;gt;US&amp;lt;/COUNTRY&amp;gt;
&amp;lt;COMPANY&amp;gt;CLOUDERA&amp;lt;/COMPANY&amp;gt;
&amp;lt;PRICE&amp;gt;24.90&amp;lt;/PRICE&amp;gt;
&amp;lt;YEAR&amp;gt;2012&amp;lt;/YEAR&amp;gt;
&amp;lt;/BOOK&amp;gt;
&amp;lt;BOOK&amp;gt;
&amp;lt;TITLE&amp;gt;Programming Pig&amp;lt;/TITLE&amp;gt;
&amp;lt;AUTHOR&amp;gt;Alan Gates&amp;lt;/AUTHOR&amp;gt;
&amp;lt;COUNTRY&amp;gt;USA&amp;lt;/COUNTRY&amp;gt;
&amp;lt;COMPANY&amp;gt;Horton Works&amp;lt;/COMPANY&amp;gt;
&amp;lt;PRICE&amp;gt;30.90&amp;lt;/PRICE&amp;gt;
&amp;lt;YEAR&amp;gt;2013&amp;lt;/YEAR&amp;gt;
&amp;lt;/BOOK&amp;gt;
&amp;lt;/CATALOG&amp;gt;&lt;/PRE&gt;</description>
      <pubDate>Tue, 21 Jun 2016 22:35:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-XML-Parising-Null-value-returned/m-p/149877#M32563</guid>
      <dc:creator>manikandan_rama</dc:creator>
      <dc:date>2016-06-21T22:35:14Z</dc:date>
    </item>
    <item>
      <title>Re: Hive XML Parsing - Null value returned</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-XML-Parising-Null-value-returned/m-p/149878#M32564</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/10453/manikandanramaraj.html" nodeid="10453"&gt;@elan chelian&lt;/A&gt;. It works with the following changes (details &lt;A href="https://github.com/dvasilen/Hive-XML-SerDe/wiki/XML-data-sources"&gt;here&lt;/A&gt;):&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;TITLE must be STRING; it seems XmlSerDe doesn't support VARCHAR yet&lt;/LI&gt;&lt;LI&gt;PRICE must be declared as FLOAT or DOUBLE, not INT, because the values are decimals (e.g., 24.90)&lt;/LI&gt;&lt;LI&gt;Your unit record of data is BOOK, not CATALOG, so xmlinput.start and xmlinput.end must delimit BOOK and the XPath expressions must start at /BOOK&lt;/LI&gt;&lt;LI&gt;The XPath expressions are missing text() to capture the element values&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;Declarations:&lt;/P&gt;&lt;PRE&gt;DROP TABLE IF EXISTS BOOKDATA;
CREATE EXTERNAL TABLE BOOKDATA (TITLE STRING, PRICE FLOAT)
 ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
 WITH SERDEPROPERTIES (
 "column.xpath.TITLE"="/BOOK/TITLE/text()",
 "column.xpath.PRICE"="/BOOK/PRICE/text()")
 STORED AS INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
 OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
 LOCATION '/user/it1/hive/xml'
 TBLPROPERTIES ("xmlinput.start"="&amp;lt;BOOK","xmlinput.end"= "&amp;lt;/BOOK&amp;gt;");
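
-- Sketch only, not part of the original answer: assuming the same SerDe and
-- the same sample file, other child elements of BOOK can presumably be mapped
-- with the same column.xpath pattern. BOOKDATA_EXT is a hypothetical table
-- name; AUTHOR and COUNTRY are taken from the sample data in the question.
DROP TABLE IF EXISTS BOOKDATA_EXT;
CREATE EXTERNAL TABLE BOOKDATA_EXT (TITLE STRING, AUTHOR STRING, COUNTRY STRING, PRICE FLOAT)
 ROW FORMAT SERDE 'com.ibm.spss.hive.serde2.xml.XmlSerDe'
 WITH SERDEPROPERTIES (
 "column.xpath.TITLE"="/BOOK/TITLE/text()",
 "column.xpath.AUTHOR"="/BOOK/AUTHOR/text()",
 "column.xpath.COUNTRY"="/BOOK/COUNTRY/text()",
 "column.xpath.PRICE"="/BOOK/PRICE/text()")
 STORED AS INPUTFORMAT 'com.ibm.spss.hive.serde2.xml.XmlInputFormat'
 OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.IgnoreKeyTextOutputFormat'
 LOCATION '/user/it1/hive/xml'
 TBLPROPERTIES ("xmlinput.start"="&amp;lt;BOOK","xmlinput.end"= "&amp;lt;/BOOK&amp;gt;");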
&lt;/PRE&gt;&lt;P&gt;Test:&lt;/P&gt;&lt;PRE&gt;hive&amp;gt; select * from BOOKDATA;
OK
Hadoop Defnitive Guide	24.9
Programming Pig	30.9
Time taken: 0.081 seconds, Fetched: 2 row(s)&lt;/PRE&gt;</description>
      <pubDate>Sun, 26 Jun 2016 11:31:30 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-XML-Parising-Null-value-returned/m-p/149878#M32564</guid>
      <dc:creator>pminovic</dc:creator>
      <dc:date>2016-06-26T11:31:30Z</dc:date>
    </item>
    <item>
      <title>Re: Hive XML Parsing - Null value returned</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-XML-Parising-Null-value-returned/m-p/149879#M32565</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/177/pminovic.html" nodeid="177"&gt;@Predrag Minovic&lt;/A&gt; Thanks for the answer, very impressive. I had assumed that the root node (CATALOG) had to be given in xmlinput.start and xmlinput.end so that all the nodes inside the root could be queried with XPath. Thanks for the clarification.&lt;/P&gt;</description>
      <pubDate>Tue, 28 Jun 2016 21:55:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-XML-Parising-Null-value-returned/m-p/149879#M32565</guid>
      <dc:creator>manikandan_rama</dc:creator>
      <dc:date>2016-06-28T21:55:50Z</dc:date>
    </item>
    <item>
      <title>Re: Hive XML Parsing - Null value returned</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-XML-Parising-Null-value-returned/m-p/294311#M32566</link>
      <description>&lt;P&gt;Hello Sir,&lt;/P&gt;&lt;P&gt;I got the output below, but I am not getting any data. Do you know why?&lt;/P&gt;&lt;PRE&gt;hive&amp;gt; select * from BOOKDATA;
OK
Hadoop Defnitive Guide	24.9
Programming Pig	30.9
Time taken: 0.081 seconds, Fetched: 2 row(s)&lt;/PRE&gt;</description>
      <pubDate>Sun, 19 Apr 2020 23:41:28 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Hive-XML-Parising-Null-value-returned/m-p/294311#M32566</guid>
      <dc:creator>Ism</dc:creator>
      <dc:date>2020-04-19T23:41:28Z</dc:date>
    </item>
  </channel>
</rss>

