06-13-2018 12:41 PM
I ran some experiments on Hive. No matter what value I set for the block size, Hive always produced Parquet files of the same size, and there are a lot of small files. Here are the table properties I'm using. Can anyone help me? Thanks in advance!

SET hive.exec.dynamic.partition.mode=nonstrict;
SET parquet.column.index.access=true;
SET hive.merge.mapredfiles=true;
SET hive.exec.compress.output=true;
SET mapred.output.compression.codec=org.apache.hadoop.io.compress.SnappyCodec;
SET mapred.output.compression.type=BLOCK;
SET parquet.compression=SNAPPY;
SET dfs.block.size=445644800;
SET parquet.block.size=445644800;
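For reference, parquet.block.size only caps the row-group size within a file; it does not force larger output files. The number of files is usually determined by the number of mappers/reducers writing output, so the small-file merge settings are what control final file sizes. Below is a minimal sketch of those merge-related settings (assuming Hive on MapReduce; the 256 MB thresholds are illustrative values, not from the original post):

-- Merge small output files at the end of map-only and map-reduce jobs
SET hive.merge.mapfiles=true;
SET hive.merge.mapredfiles=true;
-- If the average output file size is below this, Hive launches an extra merge job
SET hive.merge.smallfiles.avgsize=268435456;  -- 256 MB (illustrative)
-- Target size of each merged file
SET hive.merge.size.per.task=268435456;       -- 256 MB (illustrative)

With these set, Hive should launch an additional merge stage when a job's output files are smaller than the average-size threshold, combining them toward the per-task target size.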
Labels:
- Apache Hive
- Apache Spark