Support Questions

Yukti · 01-13-2017

I am creating a table in Hive through Beeline, and I need the data to be compressed in the PARQUET file format.

So I tried to use set parquet.compression=SNAPPY;

But when I execute this command, I get the following error:

Error: Error while processing statement: Cannot modify parquet.compression at runtime. It is not in list of params that are allowed to be modified at runtime (state=42000,code=1)

I checked, and this property is not in the whitelist of parameters that can be modified at runtime. We don't have permission to edit the whitelist.

As a workaround, instead of running set parquet.compression=SNAPPY; at runtime, I used the table property TBLPROPERTIES ('PARQUET.COMPRESS'='SNAPPY'). With that, the table is created successfully.

But after loading data into the table, I used describe table to compare its size with another table that I created without compression, and the data size is the same.

So it seems that compression is not happening with 'PARQUET.COMPRESS'='SNAPPY'.

Is there any other property that we need to set to get compression working?

For Avro, I have seen that the two properties below need to be set to enable compression:

hive> set hive.exec.compress.output=true;

hive> set avro.output.codec=snappy;

Likewise, do I need to set some other property for Parquet files?

srai1 · 01-13-2017

@Yukti Agrawal

How much data did you insert to compare the two tables? Can you try it with a substantially bigger data set? Snappy favors fast compression and decompression over a high compression ratio, so the size difference can be hard to see on a small data set.
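A rough sketch of that comparison in HiveQL, in case it helps. The table names sales_plain, sales_snappy and sales_staging are hypothetical: the first two stand for the uncompressed and the Snappy tables, and sales_staging for any larger source table you already have.

```sql
-- Hypothetical names: sales_staging is any large source table,
-- sales_plain / sales_snappy are the two Parquet tables being compared.
INSERT OVERWRITE TABLE sales_plain  SELECT * FROM sales_staging;
INSERT OVERWRITE TABLE sales_snappy SELECT * FROM sales_staging;

-- Refresh the table-level statistics so the reported sizes are current.
ANALYZE TABLE sales_plain  COMPUTE STATISTICS;
ANALYZE TABLE sales_snappy COMPUTE STATISTICS;

-- Compare the totalSize value listed under "Table Parameters" for each table.
DESCRIBE FORMATTED sales_plain;
DESCRIBE FORMATTED sales_snappy;
```

If totalSize is still identical for both tables after loading the same large data set, the compression property is most likely not being picked up.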

Yukti · 01-17-2017

It is working now with 'PARQUET.COMPRESSION'='SNAPPY' (the property name is PARQUET.COMPRESSION, not PARQUET.COMPRESS).
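For anyone landing here later, a minimal sketch of the kind of table definition this refers to. The table name and columns are made up for illustration; only the STORED AS PARQUET clause and the table property come from this thread, and the property key is written exactly as in the post.

```sql
-- Hypothetical table and columns, for illustration only.
CREATE TABLE sales_snappy (
  id     BIGINT,
  amount DOUBLE
)
STORED AS PARQUET
-- Set compression as a table property instead of "set parquet.compression=SNAPPY;",
-- which was rejected at runtime because it is not in the whitelist.
TBLPROPERTIES ('PARQUET.COMPRESSION'='SNAPPY');
```

Because the codec is set on the table itself, no session-level set command is needed, so the runtime whitelist restriction does not get in the way.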

Cloudera Community

Compression is not happening in hive using parquet file format.