Verify the file's compression properties

Hi,

How can I verify the compression properties of a file produced by a Sqoop import, or of any file compressed with Snappy or another compression codec?

 

For example, say I have a file generated by a Sqoop import:

/orders/part-m-00000.avro

How can I tell which compression codec was used for the file "part-m-00000.avro"?

Thanks in advance!

 

1 REPLY

Hi, you can inspect Avro files with the avro-tools utility. Here is a small test in Hive:

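-- This insert runs before compression is enabled.
create table work.test_avro ( i int, s string ) stored as avro;
insert into work.test_avro select 1, "abc";
-- Turn on output compression and select the Snappy codec for Avro.
set hive.exec.compress.output = true;
set hive.exec.compress.intermediate = true;
set avro.output.codec = snappy;
-- This insert writes a second file, compressed with Snappy.
insert into work.test_avro select 2, "abcdefgb";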

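The table now contains two files, one uncompressed and one compressed with Snappy. You can check this with the getmeta command, which prints the codec as avro.codec. If a file has no avro.codec entry, it is uncompressed (Avro treats a missing codec as "null"):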

$ avro-tools getmeta 000000_0
avro.schema     {"type":"record","name":"test_avro","namespace":"work","fields":[{"name":"i","type":["null","int"],"default":null},{"name":"s","type":["null","string"],"default":null}]}
$ avro-tools getmeta 000000_0_copy_1
avro.schema     {"type":"record","name":"test_avro","namespace":"work","fields":[{"name":"i","type":["null","int"],"default":null},{"name":"s","type":["null","string"],"default":null}]}
avro.codec      snappy
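
If avro-tools is not available, the same metadata can be read straight from the file header. The following is only a minimal sketch, assuming the standard Avro object container layout (4-byte magic "Obj" plus 0x01, followed by a metadata map of string keys to byte values). The function names read_avro_metadata and avro_codec are made up for this example and are not part of any Avro library.

```python
MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def _read_long(stream):
    """Decode one Avro long (variable-length, zig-zag encoded)."""
    shift = 0
    accum = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of Avro header")
        accum |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:  # high bit clear marks the last byte
            break
        shift += 7
    return (accum >> 1) ^ -(accum & 1)  # undo zig-zag encoding


def _read_bytes(stream):
    """Decode one length-prefixed byte string."""
    length = _read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of Avro header")
    return data


def read_avro_metadata(stream):
    """Return the header metadata map as {str: bytes} (hypothetical helper)."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = _read_long(stream)
        if count == 0:  # a zero count ends the map
            break
        if count < 0:  # negative count is followed by the block size in bytes
            _read_long(stream)
            count = -count
        for _ in range(count):
            key = _read_bytes(stream).decode("utf-8")
            meta[key] = _read_bytes(stream)
    return meta


def avro_codec(stream):
    """Return the codec name, or "null" when avro.codec is absent."""
    meta = read_avro_metadata(stream)
    return meta.get("avro.codec", b"null").decode("utf-8")
```

To use it, open the file in binary mode and pass the handle, for example avro_codec(open("000000_0_copy_1", "rb")). This assumes the file is on the local filesystem, so a file stored in HDFS would have to be copied down first.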