Posts: 8
Registered: ‎05-31-2016

Impala maximum number of partitions

[ Edited ]



Wanted to know whether there's a best practice to the limit of partitions one Impala table can handle ?

If so, what's the penalties of going beyond that limit ?



Cloudera Employee
Posts: 307
Registered: ‎10-16-2013

Re: Impala maximum number of partitions



the Impala Cookbook describes best practices around schema design:


We recommend that a table should have at most 100k partitions and that is already stretching it. Keeiping it around 10k is definitely desirable.


Having too many partitions can have several negative effects, including:

- Slow metadata loading and distribution time. You may even hit some hard limits (e.g. in the JVM) which make dealing wich such a table very difficult or nearly impossible.

- Excessive memory usage on all daemons due to the large metadata which is cached.

- Slow partition pruning time for certain queries if the index-based pruning is not feasible (e.g. for complex partition-pruning predicates)


Best practices for using Impala, the open source analytic database for Apache Hadoop