Created 05-10-2016 07:51 AM
I am trying to create a Hive table for storing an Avro file. I have stored my Avro schema (.avsc file) and my Avro file in a single location.
Could anyone help me create the table in Hive?
Created 05-10-2016 11:08 AM
If you already have your Avro file and Avro schema, upload them to HDFS and use
CREATE EXTERNAL TABLE my_avro_tbl
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/user/...'
TBLPROPERTIES ('avro.schema.url'='hdfs://name-node.fqdn:8020/user/.../schema.avsc');
If your Avro file already contains the schema in its header you can just say
CREATE EXTERNAL TABLE tbl_name (... declarations ...) STORED AS AVRO LOCATION '...';
without specifying the schema. I have been testing this for the last few days and can confirm that it works on HDP-2.4 (Hive-1.2) for all scalar types such as string, int, float, double and boolean. If you are using complex types (such as union) it might not work.
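As an illustration of what the "... declarations ..." placeholder stands for, here is a minimal sketch using only scalar types. The table name, column names and location are hypothetical and not taken from this thread; adjust them to the fields in your own Avro files.

```sql
-- Hypothetical table: all names and the path are placeholders for illustration.
CREATE EXTERNAL TABLE my_events (
  id        INT,
  name      STRING,
  score     DOUBLE,
  is_active BOOLEAN
)
STORED AS AVRO
LOCATION '/user/hypothetical/avro/events';
```

With STORED AS AVRO and no schema property, Hive derives the Avro schema from the declared columns, so the column names and types should line up with the fields in the data files.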
Created 05-10-2016 02:54 PM
The issue I am having is that I have Avro objects that don't have the schema in the header. When I try to access these objects by specifying the schema via the schema.url parameter in the TBLPROPERTIES, I am unable to access the data. However, if the Avro object includes the schema, I have no problem.
I'm pretty sure that the Avro objects are OK, as I can extract the data from them using avro-tools and providing the same schema. So what I would be interested in seeing is whether someone can load a schema-less Avro object into HDFS and then get a Hive table to access it by providing the schema file.
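One way to narrow this down, sketched here under assumptions, is to take the URL lookup out of the picture by inlining the schema with the avro.schema.literal table property and then checking what Hive sees. The table name, path and record schema below are hypothetical. Note that AvroContainerInputFormat reads Avro object container files, which carry their schema in the file header, so raw schema-less records may not be readable however the schema is supplied.

```sql
-- Hypothetical table, path and schema; substitute your own.
CREATE EXTERNAL TABLE my_avro_literal_tbl
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS
  INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
  OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/user/hypothetical/avro/data'
TBLPROPERTIES ('avro.schema.literal'='{
  "type": "record",
  "name": "Example",
  "namespace": "hypothetical",
  "fields": [
    {"name": "id",   "type": "int"},
    {"name": "name", "type": "string"}
  ]
}');

-- Check that Hive picked up the columns from the schema, then read a few rows.
DESCRIBE my_avro_literal_tbl;
SELECT * FROM my_avro_literal_tbl LIMIT 5;
```

If DESCRIBE lists the expected columns but the SELECT still fails, the problem is more likely in the data files than in how the schema is referenced.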
Created 11-06-2017 12:02 PM
@Predrag Minovic could you please let me know how you created the Hive Avro table from the Hive text table? Did you select from the text table and create the Avro table using CREATE TABLE AS SELECT?
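This is not necessarily how the table in the answer was produced, but one common way to get an Avro table from an existing text table is a CREATE TABLE AS SELECT, sketched below. Both table names are hypothetical, and the sketch assumes a Hive version that supports STORED AS AVRO (0.14 or later).

```sql
-- Hypothetical names: my_text_tbl is an existing text-format table.
-- Hive derives the Avro schema from the columns returned by the SELECT.
CREATE TABLE my_avro_copy
STORED AS AVRO
AS
SELECT * FROM my_text_tbl;
```

CTAS creates a managed table, not an external one, so the data ends up under the Hive warehouse directory.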
Created 05-10-2016 03:03 PM
Okay, please upload your files somewhere (one of your existing questions, or a new one), and I'll try to read them with Hive.
Created 12-08-2016 07:31 AM
When you say declarations above, do you mean actually defining the columns and data types? What is the syntax for that? I am writing HDFS files from NiFi, so it stores the schema header in the Avro file. I want to leverage that to create the Hive table but do not have any good example of how to do that. If you could elaborate on the declarations part with some examples of a few columns, it would really help.