Created on 03-08-2017 07:59 AM - edited 09-16-2022 04:13 AM
Hi,
1) If we create a table (in either Hive or Impala) and just specify STORED AS PARQUET, will it be Snappy compressed by default in CDH?
2) If not, how do I identify a Parquet table with Snappy compression versus a Parquet table without it?
Also, how do I specify Snappy compression at the table level while creating a table, and at the global level, so that every table stored as Parquet is Snappy compressed even if nobody specifies it at the table level?
Please help
Created 09-25-2017 11:01 AM
If tables are created with STORED AS PARQUET in Hive, will they use the Snappy codec or not?
According to Cloudera, most CDH components that write Parquet files do not compress them by default.
For ORC format:
CREATE TABLE testingsnappy_orc STORED AS ORC TBLPROPERTIES("orc.compress"="snappy") AS SELECT * FROM sourcetable;
For Parquet format:
The statement is the same, but use STORED AS PARQUET and the Parquet property instead of "orc.compress":
TBLPROPERTIES ( "parquet.compression"="SNAPPY" );
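For reference, here is a sketch of the full Parquet equivalent of the ORC statement above, written for Hive. In Hive, the table property that controls Parquet compression is "parquet.compression", not "orc.compress". The table name testingsnappy_parquet is made up for illustration, and sourcetable is the same placeholder used above.

```sql
-- Hive: create a Snappy-compressed Parquet table from an existing table.
-- testingsnappy_parquet is a hypothetical name; sourcetable is a placeholder.
CREATE TABLE testingsnappy_parquet
STORED AS PARQUET
TBLPROPERTIES ("parquet.compression"="SNAPPY")
AS SELECT * FROM sourcetable;

-- Hive: check whether the property is set on the table.
SHOW TBLPROPERTIES testingsnappy_parquet("parquet.compression");
```

Note that SHOW TBLPROPERTIES only reports what is declared on the table. It does not tell you which codec was used for data files that already exist, for example files written by another engine or before the property was set.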
Created 05-10-2019 02:02 PM
By default, in Hive, Parquet files are not written with compression enabled.
https://issues.apache.org/jira/browse/HIVE-11912
However, writing files with Impala into a Parquet table will create files with internal Snappy compression (by default).
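If you do not want to depend on each table declaring its own property, the codec can also be set per session before writing. The following is a minimal sketch, assuming the target Parquet table already exists; target_parquet_table is a hypothetical name and sourcetable is a placeholder. The Hive and Impala statements are run in their respective shells, not together.

```sql
-- Hive session: write Parquet output with Snappy for the statements that follow.
SET parquet.compression=SNAPPY;
INSERT OVERWRITE TABLE target_parquet_table SELECT * FROM sourcetable;

-- Impala session: Snappy is already the default for Parquet writes;
-- the COMPRESSION_CODEC query option makes it explicit (or switches to gzip / none).
SET COMPRESSION_CODEC=snappy;
INSERT OVERWRITE TABLE target_parquet_table SELECT * FROM sourcetable;
```

These settings last only for the current session. Making Snappy the default for everyone would mean putting the same Hive property into the cluster-wide Hive configuration, which I have not verified on every CDH version.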