Impala maximum number of partitions

Hi,

 

I wanted to know whether there is a best practice for the maximum number of partitions a single Impala table can handle.

If so, what are the penalties for going beyond that limit?

 

Thanks

Re: Impala maximum number of partitions

Hi!

 

The Impala Cookbook describes best practices for schema design:

http://www.slideshare.net/cloudera/the-impala-cookbook-42530186

 

We recommend that a table have at most 100k partitions, and that is already stretching it. Keeping it around 10k is definitely desirable.
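
To make the two figures above concrete, here is a minimal Python sketch that classifies a partition count against them. The constant names, the function name, and the labels are hypothetical, and treating "around 10k" as a hard cutoff at exactly 10,000 is an assumption made only for illustration.

```python
# Thresholds taken from the recommendation above. The names are hypothetical,
# and the exact 10,000 cutoff for "around 10k" is an assumption.
DESIRABLE_PARTITIONS = 10_000
MAX_PARTITIONS = 100_000


def classify_partition_count(count: int) -> str:
    """Classify a table's partition count against the recommended limits."""
    if count < 0:
        raise ValueError("partition count cannot be negative")
    if count <= DESIRABLE_PARTITIONS:
        return "ok"
    if count <= MAX_PARTITIONS:
        return "stretching"
    return "too many"


if __name__ == "__main__":
    for n in (2_500, 40_000, 250_000):
        print(n, classify_partition_count(n))
```

The partition count itself would have to come from the cluster, for example by counting the rows returned by SHOW PARTITIONS for the table.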

 

Having too many partitions can have several negative effects, including:

- Slow metadata loading and distribution. You may even hit hard limits (e.g. in the JVM) that make dealing with such a table very difficult or nearly impossible.

- Excessive memory usage on all daemons due to the large amount of cached metadata.

- Slow partition pruning for certain queries when index-based pruning is not feasible (e.g. for complex partition-pruning predicates).
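
One way to avoid these effects is to estimate the partition count before settling on a partitioning scheme. The rough Python sketch below relies on the fact that a table can have at most one partition per combination of partition-column values, so the product of the per-column distinct counts is an upper bound. The function name, the column names, and the ten-year time span are hypothetical examples, not taken from any real table.

```python
from math import prod


def max_partitions(cardinalities: dict[str, int]) -> int:
    """Upper bound on partitions: product of distinct values per partition column."""
    return prod(cardinalities.values())


# Hypothetical schemes for ten years of data.
by_hour = {"year": 10, "month": 12, "day": 31, "hour": 24}
by_day = {"year": 10, "month": 12, "day": 31}

print(max_partitions(by_hour))  # 89280
print(max_partitions(by_day))   # 3720
```

Under these assumptions, hourly partitioning lands close to the 100k ceiling, while daily partitioning stays well under 10k.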