Verify the file's compression properties

Hi,

How can I verify the compression properties of a file produced by a Sqoop import, or of any file compressed with Snappy or another compression codec?

For example:

Let's say I have a file that is generated by the Sqoop import:

/orders/part-m-00000.avro

How can I find out which compression codec was used for the file "part-m-00000.avro"?

Thanks in advance!

1 REPLY

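Hi, you can inspect Avro files with the avro-tools utility. For example, create a test table in Hive, then insert one row with the default settings and one row with Snappy output compression enabled: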

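-- Test table stored as Avro
create table work.test_avro (i int, s string) stored as avro;
-- First insert: written before compression is enabled in this session
insert into work.test_avro select 1, "abc";
-- Enable compressed output and select the Snappy codec for Avro files
set hive.exec.compress.output = true;
set hive.exec.compress.intermediate = true;
set avro.output.codec = snappy;
-- Second insert: written as a new, Snappy-compressed file
insert into work.test_avro select 2, "abcdefgb";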

The table now contains two files, one uncompressed and one compressed with Snappy. You can check this with the getmeta command: the avro.codec entry names the codec, and it is absent for the uncompressed file.

$ avro-tools getmeta 000000_0
avro.schema     {"type":"record","name":"test_avro","namespace":"work","fields":[{"name":"i","type":["null","int"],"default":null},{"name":"s","type":["null","string"],"default":null}]}
$ avro-tools getmeta 000000_0_copy_1
avro.schema     {"type":"record","name":"test_avro","namespace":"work","fields":[{"name":"i","type":["null","int"],"default":null},{"name":"s","type":["null","string"],"default":null}]}
avro.codec      snappy
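
If avro-tools is not at hand, the same metadata can be read straight from the file header. The following is only a minimal sketch, assuming the file is a standard Avro object container file (magic bytes "Obj" followed by 0x01, then a metadata map of string keys to byte values). The function names (read_long, read_avro_metadata, avro_codec) are made up for this illustration and are not part of any Avro library. A missing avro.codec entry is reported as "null", which is the Avro name for no compression.

```python
import sys

MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def read_long(stream):
    """Read one Avro long: a zigzag-encoded, variable-length integer."""
    shift = 0
    result = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of data inside the Avro header")
        result |= (byte[0] & 0x7F) << shift
        if not byte[0] & 0x80:  # high bit clear: this was the last byte
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)  # undo zigzag encoding


def read_bytes(stream):
    """Read a length-prefixed byte string (Avro 'bytes' / 'string')."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of data inside the Avro header")
    return data


def read_avro_metadata(stream):
    """Return the header metadata map as {str: bytes}."""
    if stream.read(4) != MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count terminates the map
            break
        if count < 0:  # negative count: a block byte size follows
            read_long(stream)
            count = -count
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    return meta


def avro_codec(stream):
    """Return the codec name, or 'null' when avro.codec is absent."""
    meta = read_avro_metadata(stream)
    return meta.get("avro.codec", b"null").decode("utf-8")


if __name__ == "__main__" and len(sys.argv) > 1:
    with open(sys.argv[1], "rb") as f:
        print(avro_codec(f))
```

Saved under a hypothetical name such as avro_codec.py, this could be run as "python avro_codec.py part-m-00000.avro" on a local copy of the file, for instance one fetched from HDFS with "hdfs dfs -get".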