Compression is not working in Hive

I created a Hive table and used an INSERT ... SELECT to load existing Impala data into it (sketched below). I noticed two things:

1. The new data is more than twice the size of the old data, which Impala had written with compression.
2. No matter how large I set the Parquet block size, Hive always generates Parquet files of roughly the same size.
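
For context, the statements look roughly like this; the database, table, and column names below are placeholders, not my real schema:

-- placeholder schema for illustration
CREATE TABLE mydb.events_hive (
  id BIGINT,
  payload STRING
)
PARTITIONED BY (dt STRING)
STORED AS PARQUET;

-- dynamic-partition insert from the existing Impala-written table
INSERT OVERWRITE TABLE mydb.events_hive PARTITION (dt)
SELECT id, payload, dt
FROM mydb.events_impala;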
I set the following before inserting:

SET hive.exec.dynamic.partition.mode=nonstrict;
SET parquet.column.index.access=true;
SET hive.merge.mapredfiles=true;
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;
SET dfs.block.size=445644800;
SET parquet.block.size=445644800;
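
This is roughly how I checked the result from the same Hive session; the warehouse path below is a placeholder for the table's actual location. dfs -ls lists the per-file sizes (where I see the similarly sized files), and dfs -du -s -h gives the directory total I compared against the old Impala data:

-- per-file sizes of the generated Parquet files (placeholder path)
dfs -ls /user/hive/warehouse/mydb.db/events_hive;
-- total size of the table directory, human-readable
dfs -du -s -h /user/hive/warehouse/mydb.db/events_hive;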

Can anyone point out what I did wrong? I'm using Hive 1.1.0.
Thank you!