creation of hive table in ORC format from avro files


I have a staging area, and the data loaded into it is in Avro format. Can I create ORC files directly from the Avro files without creating an Avro table in Hive? Since the data is already in Avro's binary format, is it possible to create the ORC table directly, rather than first writing DDL for an Avro table in Hive and then inserting the data from that table into an ORC table?


Re: creation of hive table in ORC format from avro files

Cloudera Employee

Hello, 

 

1. ORC files can be created from Avro files, but not directly. It takes two steps:

a. Convert the Avro file to JSON using the avro-tools jar on the command line.

b. Convert the JSON file to ORC using the orc-tools jar (available since ORC v1.4). [See: https://orc.apache.org/news/2017/05/08/ORC-1.4.0/]

 

2. Through Hive tables: yes, this can be done by creating a new table with the ORC storage format and inserting into it the data from the table that holds the Avro data. [In the example below, test2 stores the data in ORC format and test1 holds the data in Avro.]

 

CREATE TABLE test2
(col1 string,
col2 string)
STORED AS ORC;

INSERT INTO test2
select * from test1;
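If test1 does not exist yet, one possible way to keep the Avro side lightweight is to define it as an external table directly over the staging directory, so that no data is copied and only metadata is created. The sketch below is a hedged illustration, not a tested recipe: the table names staging_avro and events_orc, the HDFS paths, and the schema file are all hypothetical, and it assumes a Hive version that supports STORED AS AVRO and can derive the columns from avro.schema.url.

```sql
-- Hypothetical names and paths throughout; adjust to your environment.
-- External table: Hive only records metadata and reads the Avro files in place.
CREATE EXTERNAL TABLE staging_avro
STORED AS AVRO
LOCATION '/data/staging/events'                                  -- assumed staging directory
TBLPROPERTIES ('avro.schema.url'='hdfs:///schemas/events.avsc'); -- assumed schema file

-- Create and populate the ORC table in one statement (CREATE TABLE AS SELECT).
CREATE TABLE events_orc
STORED AS ORC
AS
SELECT * FROM staging_avro;
```

Dropping an external table removes only its metadata, so the Avro files in staging are left untouched once the ORC table has been populated.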

 

Thanks!

 

 
