Verify the file's compression properties
- Labels: Apache Sqoop
Created on 08-23-2018 09:12 AM - edited 09-16-2022 06:37 AM
Hi,
How can I verify the compression properties of a file produced by a Sqoop import, or of any file compressed with Snappy or another compression codec?
For example, let's say I have a file that was generated by a Sqoop import:
/orders/part-m-00000.avro
How can I tell which compression codec was used for the file "part-m-00000.avro"?
Thanks in advance!
Created 08-24-2018 01:28 AM
Hi, you can inspect Avro files with the avro-tools utility. As a test, I created a Hive table and inserted one row without compression and one row with Snappy compression enabled:
create table work.test_avro ( i int, s string ) stored as avro;
insert into work.test_avro select 1, "abc";
set hive.exec.compress.output = true;
set hive.exec.compress.intermediate = true;
set avro.output.codec = snappy;
insert into work.test_avro select 2, "abcdefgb";
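This table now contains two files, one compressed with Snappy and one uncompressed. You can check this with the getmeta command. The compressed file reports "avro.codec snappy", while the uncompressed file has no avro.codec entry: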
$ avro-tools getmeta 000000_0
avro.schema {"type":"record","name":"test_avro","namespace":"work","fields":[{"name":"i","type":["null","int"],"default":null},{"name":"s","type":["null","string"],"default":null}]}
$ avro-tools getmeta 000000_0_copy_1
avro.schema {"type":"record","name":"test_avro","namespace":"work","fields":[{"name":"i","type":["null","int"],"default":null},{"name":"s","type":["null","string"],"default":null}]}
avro.codec snappy
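If avro-tools is not at hand, the same metadata can be read directly from the file header. The following is a minimal Python sketch, not part of the original answer. It assumes the file is a standard Avro object container file (magic bytes, then a metadata map, then a sync marker) and that a missing avro.codec entry means the "null" codec, i.e. no compression. The function names read_long, read_bytes, read_avro_metadata and avro_codec are made up for this illustration.

```python
import io

AVRO_MAGIC = b"Obj\x01"  # first four bytes of an Avro object container file


def read_long(stream):
    """Decode one Avro long (zigzag-encoded, variable-length integer)."""
    shift = 0
    result = 0
    while True:
        byte = stream.read(1)
        if not byte:
            raise EOFError("unexpected end of Avro header")
        value = byte[0]
        result |= (value & 0x7F) << shift
        if not value & 0x80:  # high bit clear: this was the last byte
            break
        shift += 7
    return (result >> 1) ^ -(result & 1)  # undo zigzag encoding


def read_bytes(stream):
    """Read a length-prefixed byte string (Avro 'bytes' or 'string')."""
    length = read_long(stream)
    data = stream.read(length)
    if len(data) != length:
        raise EOFError("unexpected end of Avro header")
    return data


def read_avro_metadata(stream):
    """Return the header metadata map as {str: bytes}."""
    if stream.read(4) != AVRO_MAGIC:
        raise ValueError("not an Avro object container file")
    meta = {}
    while True:
        count = read_long(stream)
        if count == 0:  # a zero count ends the map
            break
        if count < 0:  # negative count is followed by the block size in bytes
            count = -count
            read_long(stream)  # block size is not needed here
        for _ in range(count):
            key = read_bytes(stream).decode("utf-8")
            meta[key] = read_bytes(stream)
    return meta


def avro_codec(stream):
    """Return the codec name; assumes a missing entry means 'null'."""
    meta = read_avro_metadata(stream)
    return meta.get("avro.codec", b"null").decode("utf-8")
```

To use it, copy the file to the local filesystem first (for example with hdfs dfs -get), open it in binary mode with open("part-m-00000.avro", "rb"), and pass the file object to avro_codec. Only the header is read, so this stays fast even for large files.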
