
Parquet table snappy compressed by default

Re: Parquet table snappy compressed by default

Champion

If tables are created STORED AS PARQUET in Hive, will they use the Snappy codec or not?

According to Cloudera, most CDH components that use Parquet files do not compress them by default.

For ORC format:

CREATE TABLE testingsnappy_orc
STORED AS ORC
TBLPROPERTIES("orc.compress"="snappy")
AS SELECT * FROM sourcetable;

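For Parquet format:

Use the same statement with STORED AS PARQUET, but set the parquet.compression table property instead of orc.compress:

TBLPROPERTIES ( "parquet.compression"="SNAPPY" );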
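As a fuller sketch of the Parquet case, the ORC statement above might be adapted as follows. The table name testingsnappy_parquet is hypothetical, sourcetable is reused from the ORC example, and this assumes a Hive version that honors the parquet.compression table property (note that it is not orc.compress):

```sql
-- Hypothetical table name; sourcetable comes from the ORC example above.
-- Assumes a Hive version that reads the parquet.compression table property.
CREATE TABLE testingsnappy_parquet
STORED AS PARQUET
TBLPROPERTIES ("parquet.compression"="SNAPPY")
AS SELECT * FROM sourcetable;

-- The property should then be listed among the table parameters.
DESCRIBE FORMATTED testingsnappy_parquet;
```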
 

Re: Parquet table snappy compressed by default

Contributor

By default, in Hive, Parquet files are not written with compression enabled.

https://issues.apache.org/jira/browse/HIVE-11912

However, files written by Impala into a Parquet table are compressed internally with Snappy by default.
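On the Impala side, the codec used for Parquet writes can be changed per session with the COMPRESSION_CODEC query option. A minimal sketch, assuming an existing Parquet table (parquet_table is a hypothetical name; sourcetable is borrowed from the earlier post):

```sql
-- Run in impala-shell; parquet_table is a hypothetical, pre-existing Parquet table.
SET COMPRESSION_CODEC=snappy;   -- the default for Parquet writes
INSERT INTO parquet_table SELECT * FROM sourcetable;

-- Write uncompressed Parquet files instead.
SET COMPRESSION_CODEC=none;
INSERT INTO parquet_table SELECT * FROM sourcetable;
```

The option applies only to the current session, so tables written by other sessions keep whatever codec was in effect there.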