<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Unable to access Avro object in HBase from Hive in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Unable-to-access-Avro-object-in-HBase-from-Hive/m-p/134957#M97616</link>
    <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/6713/markdoutre.html" nodeid="6713"&gt;@mark doutre&lt;/A&gt; I've just found a &lt;A href="http://blog.cloudera.com/blog/2016/05/how-to-improve-apache-hbase-performance-via-data-serialization-with-apache-avro/"&gt;new blog post&lt;/A&gt; talking about your use-case of storing Avro schema-less objects in HBase. It's implemented by direct interaction with HBase, without Hive. The code appears to be simple. HTH&lt;/P&gt;</description>
    <pubDate>Thu, 19 May 2016 11:10:39 GMT</pubDate>
    <dc:creator>pminovic</dc:creator>
    <dc:date>2016-05-19T11:10:39Z</dc:date>
    <item>
      <title>Unable to access Avro object in HBase from Hive</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Unable-to-access-Avro-object-in-HBase-from-Hive/m-p/134952#M97611</link>
      <description>&lt;P&gt;I have a number of simple Avro objects stored in HBase and am trying to access them from Hive. I've set up a Hive table by following the instructions that I found &lt;A href="https://cwiki.apache.org/confluence/display/Hive/HBaseIntegration#HBaseIntegration-AvroDataStoredinHBaseColumns.1"&gt;here.&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Basically in Hive I do:&lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;DROP TABLE IF EXISTS HBaseAvro; &lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;CREATE EXTERNAL TABLE HBaseAvro &lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;ROW FORMAT SERDE 'org.apache.hadoop.hive.hbase.HBaseSerDe' &lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler' &lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;WITH SERDEPROPERTIES ("hbase.columns.mapping" = ":key,event:pCol",
                      "event.pCol.serialization.type" = "avro",
                      "event.pCol.avro.schema.url" = "hdfs:///tmp/kafka/avro/avro.avsc") &lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;TBLPROPERTIES("hbase.table.name" = "avro",
              "hbase.mapred.output.outputtable" = "avro",
              "hbase.struct.autogenerate" = "true");&lt;/P&gt;&lt;P&gt;If the Avro object contains the schema in the header, I have no problem and can access the data. However, if the Avro object does NOT contain the schema, then when I try to access it I get an IOException:&lt;/P&gt;&lt;P&gt;{"message":"H170 Unable to fetch results. java.io.IOException: org.apache.hadoop.hive.ql.metadata.HiveException: Error evaluating event_pcol...&lt;/P&gt;&lt;P&gt;If I do a DESCRIBE on the Hive table, I can see the table correctly, in that event_pcol is shown as a structure with the correct fields.&lt;/P&gt;&lt;P&gt;I've tried moving the avsc file to check that the CREATE TABLE is working, and Hive correctly complains. With the CREATE as above, the table appears to be created correctly and I can access the "key" values, so the problem appears to be with the Avro object.&lt;/P&gt;&lt;P&gt;To me it looks like Hive is not using the schema definition passed in the schema.url parameter. I've tried including the schema as a schema.literal parameter and it still fails.&lt;/P&gt;&lt;P&gt;Any ideas?&lt;/P&gt;</description>
      <pubDate>Fri, 06 May 2016 18:10:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Unable-to-access-Avro-object-in-HBase-from-Hive/m-p/134952#M97611</guid>
      <dc:creator>markdoutre</dc:creator>
      <dc:date>2016-05-06T18:10:44Z</dc:date>
    </item>
    <item>
      <title>Re: Unable to access Avro object in HBase from Hive</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Unable-to-access-Avro-object-in-HBase-from-Hive/m-p/134953#M97612</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/177/pminovic.html" nodeid="177"&gt;@Predrag Minovic&lt;/A&gt; Here are the Avro file and the associated schema.&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.cloudera.com/legacyfs/online/attachments/4139-avrobug.zip"&gt;avrobug.zip&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 10 May 2016 22:12:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Unable-to-access-Avro-object-in-HBase-from-Hive/m-p/134953#M97612</guid>
      <dc:creator>markdoutre</dc:creator>
      <dc:date>2016-05-10T22:12:22Z</dc:date>
    </item>
    <item>
      <title>Re: Unable to access Avro object in HBase from Hive</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Unable-to-access-Avro-object-in-HBase-from-Hive/m-p/134954#M97613</link>
      <description>&lt;P&gt;Associated Hive code. Avro files are stored in /user/hue/testdata/avro_data/avro.avro etc&lt;/P&gt;&lt;PRE&gt;DROP TABLE IF EXISTS avro_test;
CREATE EXTERNAL TABLE avro_test
    COMMENT "A table backed by Avro data with the Avro schema stored in HDFS"
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
    STORED AS
    INPUTFORMAT  'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
    OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
    LOCATION 'hdfs:///user/hue/testdata/avro_data'
    TBLPROPERTIES (
        'avro.schema.url'='hdfs:///user/hue/testdata/avro.avsc'
    );

&lt;/PRE&gt;</description>
      <pubDate>Tue, 10 May 2016 22:16:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Unable-to-access-Avro-object-in-HBase-from-Hive/m-p/134954#M97613</guid>
      <dc:creator>markdoutre</dc:creator>
      <dc:date>2016-05-10T22:16:25Z</dc:date>
    </item>
    <item>
      <title>Re: Unable to access Avro object in HBase from Hive</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Unable-to-access-Avro-object-in-HBase-from-Hive/m-p/134955#M97614</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/6713/markdoutre.html" nodeid="6713"&gt;@mark doutre&lt;/A&gt;, I checked your files (2 days ago, but couldn't post sooner), and my conclusion is that Hive cannot handle Avro files without a schema. On the &lt;A href="https://cwiki.apache.org/confluence/display/Hive/AvroSerDe"&gt;AvroSerDe&lt;/A&gt; page you can see which Avro versions are supported (1.5.3 to 1.7.5), and the &lt;A href="https://avro.apache.org/docs/1.5.3/spec.html#Data+Serialization"&gt;Avro spec&lt;/A&gt; says: &lt;EM&gt;Avro data is always serialized with its schema. Files that store Avro data should always also include the schema for that data in the same file.&lt;/EM&gt; It has been so since version 1. So it's very clear that "standard" Avro files must include the schema, and Hive supports only such files. With schema-less files you are on your own: you would have to read the "value" from HBase, apply your schema to decode the data, and store the resulting records in Hive. You can also include the schema, which will work, but you will waste some space in HBase by storing the same schema in each record. Hope this helps.&lt;/P&gt;</description>
      <pubDate>Fri, 13 May 2016 19:39:38 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Unable-to-access-Avro-object-in-HBase-from-Hive/m-p/134955#M97614</guid>
      <dc:creator>pminovic</dc:creator>
      <dc:date>2016-05-13T19:39:38Z</dc:date>
    </item>
    <item>
      <title>Re: Unable to access Avro object in HBase from Hive</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Unable-to-access-Avro-object-in-HBase-from-Hive/m-p/134956#M97615</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/177/pminovic.html" nodeid="177"&gt;@Predrag Minovic&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Thanks for taking the time to look into this. I had sort of come to the same conclusion, but all the info I had seen online seemed to suggest that Hive could access a schema-less Avro object provided that the schema was included via the TBLPROPERTIES avro.schema.url parameter.&lt;/P&gt;</description>
      <pubDate>Fri, 13 May 2016 21:54:02 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Unable-to-access-Avro-object-in-HBase-from-Hive/m-p/134956#M97615</guid>
      <dc:creator>markdoutre</dc:creator>
      <dc:date>2016-05-13T21:54:02Z</dc:date>
    </item>
    <item>
      <title>Re: Unable to access Avro object in HBase from Hive</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Unable-to-access-Avro-object-in-HBase-from-Hive/m-p/134957#M97616</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/6713/markdoutre.html" nodeid="6713"&gt;@mark doutre&lt;/A&gt; I've just found a &lt;A href="http://blog.cloudera.com/blog/2016/05/how-to-improve-apache-hbase-performance-via-data-serialization-with-apache-avro/"&gt;new blog post&lt;/A&gt; talking about your use-case of storing Avro schema-less objects in HBase. It's implemented by direct interaction with HBase, without Hive. The code appears to be simple. HTH&lt;/P&gt;</description>
      <pubDate>Thu, 19 May 2016 11:10:39 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Unable-to-access-Avro-object-in-HBase-from-Hive/m-p/134957#M97616</guid>
      <dc:creator>pminovic</dc:creator>
      <dc:date>2016-05-19T11:10:39Z</dc:date>
    </item>
    <item>
      <title>Re: Unable to access Avro object in HBase from Hive</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Unable-to-access-Avro-object-in-HBase-from-Hive/m-p/134958#M97617</link>
      <description>&lt;P&gt;Any update on this?&lt;/P&gt;&lt;P&gt;I am running into the same exception. Do we need to write the Avro record with its schema?&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/users/6713/markdoutre.html"&gt;@mark doutre&lt;/A&gt; &lt;A href="https://community.hortonworks.com/users/177/pminovic.html"&gt;@Predrag Minovic&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 23 Aug 2018 03:47:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Unable-to-access-Avro-object-in-HBase-from-Hive/m-p/134958#M97617</guid>
      <dc:creator>yeshwanth43</dc:creator>
      <dc:date>2018-08-23T03:47:40Z</dc:date>
    </item>
  </channel>
</rss>

