Change default Hive compression codec

Hi Cloudera Community,

How can I change Hive's compression codec at runtime? I'm reading a table stored in Avro format and compressed with Snappy, and I'm trying to write a similar table that is also compressed with Snappy, but the output is compressed with "deflate". I have tried multiple options, and the resulting files always end up with the same codec.

Can you help me identify the problem in the following statements, or tell me what I need to do to set Hive's compression codec at runtime?

 


"SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;
SET hive.exec.dynamic.partition.mode=nonstrict;

CREATE EXTERNAL TABLE IF NOT EXISTS tableX PARTITIONED BY (year INT)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES ('avro.schema.url'='hdfs:///AAA/BBB/CCC/tableX.avsc');

ALTER TABLE tableX ADD IF NOT EXISTS PARTITION (year = 2016)
LOCATION 'hdfs://nameservice/AAA/BBB/CCC/2016';

INSERT OVERWRITE TABLE tableX PARTITION (year = 2016)
SELECT id, name, email
FROM tablaY WHERE year = 2016;"

 

Best Regards, 

 

Esteban

Re: Change default Hive compression codec

Quoting the documentation on using Avro files at https://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_avro_usage.html#topic_26_2:

"""
Hive
(…)
To enable Snappy compression on output [avro] files, run the following before writing to the table:

SET hive.exec.compress.output=true;
SET avro.output.codec=snappy;
"""

Please try this out. The only thing you're missing is the second property shown here, which appears to be specific to Avro serialization in Hive.

Avro's default compression codec is deflate, which explains the behaviour you observe when that property is not set.
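
For reference, here is a minimal sketch of the write step from the question with that property added. It assumes tableX and its 2016 partition already exist as created in the original post, and it reuses the table and column names from there unchanged. I have left out the two mapred.output.compression.* settings on the assumption that the Avro writer takes its codec from avro.output.codec instead, so treat this as something to try rather than a verified fix.

```sql
-- Turn on compressed output for this session.
SET hive.exec.compress.output=true;

-- Codec used for Avro output files (assumption: this setting, not
-- mapred.output.compression.codec, is the one the Avro writer honours).
SET avro.output.codec=snappy;

-- Same insert as in the question; names are taken from the original post.
INSERT OVERWRITE TABLE tableX PARTITION (year = 2016)
SELECT id, name, email
FROM tablaY
WHERE year = 2016;
```

To confirm which codec was actually used, one option is to look at the avro.codec entry in the metadata header of one of the newly written files.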