Created 11-13-2013 06:53 PM
The following Impala statement
INSERT INTO <parquet_table> PARTITION(...) SELECT * FROM <avro_table>
creates many Parquet files of about 350 MB each in every partition.
However, the Impala documentation states:
"Parquet data files use a 1GB block size, so when deciding how finely to partition the data, try to find a granularity where each partition contains 1GB or more of data, rather than creating a large number of smaller files split among many partitions."
I am using impalad version 1.1.1 RELEASE (build 83d5868f005966883a918a819a449f636a5b3d5f).
How can I increase the Parquet file size?
Thanks,
Alex