Champion
Posts: 776
Registered: ‎05-16-2016

Re: Parquet table snappy compressed by default

If tables are created with STORED AS PARQUET in Hive, will they use the Snappy codec or not?

According to Cloudera, most CDH components that write Parquet files do not compress them by default.

 

For ORC format:

CREATE TABLE testingsnappy_orc
STORED AS ORC
TBLPROPERTIES("orc.compress"="snappy")
AS SELECT * FROM sourcetable;

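For Parquet format, the statement is the same, but use STORED AS PARQUET and the Parquet compression property instead of the ORC one:

TBLPROPERTIES ("parquet.compression"="SNAPPY");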
 
Cloudera Employee
Posts: 117
Registered: ‎11-20-2015

Re: Parquet table snappy compressed by default

By default, in Hive, Parquet files are not written with compression enabled.

 

https://issues.apache.org/jira/browse/HIVE-11912
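Since Hive does not compress Parquet output unless told to, one way to opt in is to set the codec for the session before writing. The following is only a sketch: it assumes your Hive version honors the parquet.compression property at session level, and testingsnappy_parquet is a hypothetical table name (sourcetable is taken from the earlier post).

```sql
-- Assumption: this Hive build picks up parquet.compression from the session.
SET parquet.compression=SNAPPY;

-- Hypothetical target table; the files written by this statement
-- should then be Snappy-compressed Parquet.
CREATE TABLE testingsnappy_parquet
STORED AS PARQUET
AS SELECT * FROM sourcetable;
```

Setting the property in TBLPROPERTIES instead keeps the choice attached to the table rather than to whoever happens to run the insert.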

 

However, writing files with Impala into a Parquet table will create files with internal Snappy compression (by default).
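If you want to control the Impala behavior explicitly, the COMPRESSION_CODEC query option can be set in impala-shell before the write. This is a rough sketch: parquet_table is a hypothetical, already existing Parquet table, and the set of accepted codec values depends on your Impala version.

```sql
-- Turn off the default Snappy compression for this session
-- (other values include snappy and gzip).
SET COMPRESSION_CODEC=none;

-- Hypothetical Parquet table; files from this insert are written uncompressed.
INSERT INTO parquet_table SELECT * FROM sourcetable;

-- Restore the default codec for subsequent writes.
SET COMPRESSION_CODEC=snappy;
```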