Change default Hive compression codec
- Labels: Apache Hive, Apache YARN, HDFS
Created on ‎05-08-2019 11:52 AM - edited ‎09-16-2022 07:22 AM
Hi Cloudera Community,
How can I change Hive's compression codec at runtime? I'm reading a table in Avro format compressed with Snappy, and I'm trying to write a similar table that is also compressed with Snappy, but the result is compressed with "deflate". After trying multiple options, the resulting files were still compressed with the same codec.
Can you help me identify the issue in the following statements, or tell me what I can do to set Hive's compression codec at runtime?
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;
SET hive.exec.dynamic.partition.mode=nonstrict;
CREATE EXTERNAL TABLE IF NOT EXISTS tableX PARTITIONED BY (year INT)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES ('avro.schema.url'='hdfs:///AAA/BBB/CCC/tableX.avsc');
ALTER TABLE tableX ADD IF NOT EXISTS PARTITION (year = 2016)
LOCATION 'hdfs://nameservice/AAA/BBB/CCC/2016';
INSERT OVERWRITE TABLE tableX PARTITION (year = 2016) SELECT
id, name, email
FROM tablaY WHERE year = 2016;
Best Regards,
Esteban
Created ‎05-09-2019 02:09 AM
"""
Hive
(…)
To enable Snappy compression on output [avro] files, run the following before writing to the table:
SET hive.exec.compress.output=true;
SET avro.output.codec=snappy;
"""
Please try this out. You're missing only the second property mentioned here, which appears specific to Avro serialization in Hive.
Avro's default output codec is deflate, which explains the behaviour you observe when that property is not set.
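For reference, here is a minimal sketch of how the write from the question might look with the missing property added. It reuses the table names, columns, and partition from the original post and assumes that tableX and its 2016 partition already exist as defined there. I have not run it against your cluster.

```sql
-- Enable compressed output for the query results.
SET hive.exec.compress.output=true;
-- Select the codec used for Avro container files (the property missing from the question).
SET avro.output.codec=snappy;

-- Same insert as in the question; tableX and tablaY are the poster's tables.
INSERT OVERWRITE TABLE tableX PARTITION (year = 2016)
SELECT id, name, email
FROM tablaY
WHERE year = 2016;
```

To check the result, you can look at the avro.codec entry in the header of one of the written files, for example with the getmeta command of avro-tools.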
