Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Maximum Hive Table Partitions allowed & recommended

Expert Contributor

What is the maximum number of partitions allowed for a Hive table? E.g. 2k ... 10k?

Are there any performance implications we should consider as we get close to this number?

1 ACCEPTED SOLUTION

Wes, current Hive versions with an RDBMS metastore backend should be able to handle 10,000+ partitions.

For numerous reasons, the community is moving away from this design to leverage HBase for the metastore. Follow https://issues.apache.org/jira/browse/HIVE-9452 for progress. The overall design document is available here: https://issues.apache.org/jira/secure/attachment/12697601/HBaseMetastoreApproach.pdf
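
As a rough way to compare a planned layout with the 10,000+ figure above: the total partition count is at most the product of the distinct values in each partition column. A minimal sketch, assuming a hypothetical table partitioned by day (`ds`) and `region`; the column names and counts are illustrative and do not come from this thread:

```python
from math import prod

def total_partitions(distinct_values_per_column):
    """Upper bound on partition count: the product of the number of
    distinct values in each partition column."""
    return prod(distinct_values_per_column.values())

# Hypothetical layout: 3 years of daily partitions across 12 regions.
layout = {"ds": 3 * 365, "region": 12}
print(total_partitions(layout))  # 13140
```

The real count can be lower if not every combination of values actually occurs in the data.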

7 REPLIES

Master Mentor

@Andrew Grande Thanks for sharing the HBASE approach. Nice!!!

Master Mentor

What database are you using for Metastore? @Wes Floyd

Super Collaborator

I once ran into problems when working with a table of 1,000 partitions with Hive concurrency enabled. I don't know if it is still an issue (the problem appeared last year with Hive 0.13), but I think it is worth mentioning here:

http://mail-archives.apache.org/mod_mbox/hive-user/201408.mbox/%3CCAENxBwxmjN7VTJuzq1G4FimoFYkwZsWJJ...

New Member

The performance implications mostly come at read time. If you have queries that read many (>2k) partitions, you will see long (30+ sec) query planning times. As Andrew mentioned, the work on the HBase metastore should improve this.
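
To get a feel for whether a query falls in that range, you can estimate how many partitions its filter touches before running it. This is only a sketch under assumptions: it assumes a daily-partitioned table with an optional second partition column, the helper names are hypothetical, and the threshold is simply the ">2k" figure quoted above rather than a hard Hive limit:

```python
from datetime import date

PLANNING_WARN_THRESHOLD = 2000  # the ">2k" figure from the reply above

def partitions_read(start, end, other_partition_values=1):
    """Partitions touched by a filter on an inclusive date range of a
    daily-partitioned table. other_partition_values is the number of
    values matched in any additional partition column."""
    days = (end - start).days + 1
    return max(days, 0) * other_partition_values

def likely_slow_to_plan(partition_count):
    """True when the count exceeds the rough threshold quoted above."""
    return partition_count > PLANNING_WARN_THRESHOLD

# One year of daily partitions across 12 values of a second column.
n = partitions_read(date(2015, 1, 1), date(2015, 12, 31), 12)
print(n, likely_slow_to_plan(n))  # 4380 True
```

Narrowing the filter on the partition columns is what reduces this number; the total partition count of the table is a separate concern.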

Master Mentor

Thanks @gates@hortonworks.com for chiming in.

Rising Star

What if I only open fewer than 50 partitions out of 1M at any given time?