09-25-2017 11:01 AM
If tables are created STORED AS PARQUET in Hive, will they use the Snappy codec or not?
According to Cloudera, most CDH components that write Parquet files do not compress them by default.
For ORC format:
CREATE TABLE testingsnappy_orc STORED AS ORC TBLPROPERTIES("orc.compress"="snappy") AS SELECT * FROM sourcetable;
For Parquet format:
Use the same statement, but with STORED AS PARQUET and the Parquet property key instead of the ORC one:
TBLPROPERTIES ("parquet.compression"="SNAPPY");
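Written out in full, the Parquet version of the ORC statement might look like the sketch below. The table name testingsnappy_parquet is made up for illustration, and sourcetable is the same placeholder as in the ORC example. Note that the property key for Parquet is parquet.compression, not orc.compress.

```sql
-- Hive: create a Snappy-compressed Parquet table from an existing table.
-- testingsnappy_parquet is a hypothetical name; sourcetable is a placeholder.
CREATE TABLE testingsnappy_parquet
STORED AS PARQUET
TBLPROPERTIES ("parquet.compression"="SNAPPY")
AS SELECT * FROM sourcetable;
```

If you would rather not set it per table, running SET parquet.compression=SNAPPY; in the Hive session before the INSERT or CTAS should have the same effect for that session.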
05-10-2019 02:02 PM
By default, in Hive, Parquet files are not written with compression enabled.
However, writing files with Impala into a Parquet table will create files with internal Snappy compression (by default).
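On the Impala side the codec is controlled by the COMPRESSION_CODEC query option rather than a table property. A minimal sketch, assuming it is run in impala-shell and using a hypothetical table name (parquet_snappy_impala) with sourcetable again as a placeholder:

```sql
-- Impala: snappy is already the default for Parquet writes, so this SET is
-- only needed to be explicit or to switch back from another codec.
SET COMPRESSION_CODEC=snappy;

-- parquet_snappy_impala is a hypothetical name; sourcetable is a placeholder.
CREATE TABLE parquet_snappy_impala
STORED AS PARQUET
AS SELECT * FROM sourcetable;
```

Setting COMPRESSION_CODEC=none before the write would instead produce uncompressed Parquet files, matching Hive's default behaviour.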