Create Hive ORC table from avro file

Hello,

Is it possible to create an internal (managed) ORC Hive table from an Avro file on HDFS?

I tried something like this:

CREATE TABLE orc_table
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED as orc
LOCATION '/user/someuser/avro_folder/'
TBLPROPERTIES ('avro.schema.url'='hdfs:///user/someuser/schema.avsc');

but I get this error:

Failed with exception java.io.IOException:java.lang.RuntimeException: serious problem

Besides, a DESCRIBE FORMATTED command returns:

# Storage Information
SerDe Library:          org.apache.hadoop.hive.serde2.avro.AvroSerDe
InputFormat:            org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
OutputFormat:           org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat

Thanks in advance.

1 ACCEPTED SOLUTION

@younes kafi

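The CREATE TABLE statement is incorrect: a table definition cannot make Hive read Avro files as ORC. STORED AS ORC sets the ORC input and output formats, but the files under LOCATION are still Avro, so reading them fails. Hive does not convert existing files in place. The usual approach is to define a table over the Avro files first: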

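-- The Avro SerDe needs the matching Avro container input/output formats
CREATE TABLE avro_table
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
LOCATION '/user/someuser/avro_folder/'
TBLPROPERTIES ('avro.schema.url'='hdfs:///user/someuser/schema.avsc');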

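Then create the ORC table and fill it from the Avro table. CREATE TABLE ... STORED AS ORC needs either a column list or an AS SELECT clause, so the simplest form is a single CTAS statement:
CREATE TABLE orc_table STORED AS ORC AS SELECT * FROM avro_table;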
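
To check that the copy worked, something like the following could be run afterwards. This is only a sketch that assumes the table names avro_table and orc_table used above; the exact DESCRIBE output varies between Hive versions.

```sql
-- The storage section should now show the ORC SerDe
-- (org.apache.hadoop.hive.ql.io.orc.OrcSerde) together with the
-- ORC input/output formats, rather than the Avro SerDe.
DESCRIBE FORMATTED orc_table;

-- The source table and the ORC copy should report the same row count.
SELECT COUNT(*) FROM avro_table;
SELECT COUNT(*) FROM orc_table;
```

If the counts match and the SerDe and formats all point to ORC, the Avro-backed table is only needed as long as the source files are.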
