hive set block size not working

New Contributor

I ran some experiments on Hive. No matter what value I set for the block size, Hive always produces Parquet files of the same size, and there are a lot of small files. Here are the settings I used. Can anyone help me? Thanks in advance!


SET hive.exec.dynamic.partition.mode=nonstrict;

SET parquet.column.index.access=true;

SET hive.merge.mapredfiles=true;

SET hive.exec.compress.output=true;


SET mapred.output.compression.type=BLOCK;

SET parquet.compression=SNAPPY;

SET dfs.block.size=445644800;

SET parquet.block.size=445644800;


Re: hive set block size not working




You mentioned that there are a lot of small files, and that you set the block size to 445644800 bytes (exactly 425 MiB, or roughly 445 MB).


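If the block size is larger than each small file, changing it makes no difference: the block size is only an upper bound, so a file smaller than the block is written at its actual size.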


For example, with 1 MB files, all of the block sizes below give the same result:

445 MB > 1 MB 

400 MB > 1 MB

300 MB > 1 MB

200 MB > 1 MB

100 MB > 1 MB

10 MB > 1 MB

2 MB > 1 MB


You might see a difference in file size only when the block size is set smaller than the small files.
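
If the goal is fewer, larger output files, Hive's merge settings are usually a more direct lever than the block size. Here is a rough sketch, assuming the job runs on MapReduce or Tez; the two size values are illustrative assumptions, not tuned recommendations:

```sql
-- Merge small files at the end of map-only and map-reduce jobs.
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;
-- Same idea when the execution engine is Tez.
SET hive.merge.tezfiles=true;
-- Trigger a merge when the average output file size is below this many bytes
-- (assumed value: 128 MiB).
SET hive.merge.smallfiles.avgsize=134217728;
-- Target size in bytes for each merged file (assumed value: 256 MiB).
SET hive.merge.size.per.task=268435456;
```

With dynamic partitioning, the number of files also depends on how many tasks write into each partition, so these settings may not be enough on their own.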




Re: hive set block size not working

New Contributor

How exactly do you increase the size of the files created by the Hive job, then?