Max number of databases and tables allowed in Hive Metastore
- Labels: Apache Hive
Created ‎08-29-2022 01:30 AM
Crossposting from: Stack Overflow
(I will respond and give updates on both places)
I would like to know whether there is also a maximum for the items below:
- Maximum number of databases in a catalog (I assume Hive Metastore only has one catalog, which is "hive")
- Maximum number of tables per database (as in, can I create 10 million tables in a database or due to limitation must I split them into 10 databases each with 1 million tables)
I would also like to know whether these limits are hard (not configurable), configurable in Hive, or dependent on the RDBMS the metastore uses.
Created ‎08-29-2022 03:43 AM
Hi,
There is no limit on the number of databases in the Hive metastore. As a good practice, we do not recommend creating tables with more than 10,0000 partitions.
In my opinion, you shouldn't have a problem with 2,000 tables. You can expect some performance issues once the total number of objects exceeds 500,000, but there is no hard limit on the number of Hive/Impala databases or tables that you can have in the cluster.
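To see where a metastore stands against the 500,000 figure, one option is to count rows directly in the backing RDBMS. The sketch below is only an illustration: it assumes the backing schema exposes tables named `DBS`, `TBLS`, and `PARTITIONS`, it treats "objects" as databases plus tables plus partitions (one possible reading), and it uses an in-memory SQLite database as a stand-in for the real metastore database. The helper names `count_metastore_objects`, `exceeds_warning`, and `demo_connection` are hypothetical.

```python
import sqlite3

# Rule-of-thumb figure from this reply; it is not a hard limit.
OBJECT_COUNT_WARNING = 500_000

# Assumed names of the metastore backing tables (databases, tables, partitions).
METASTORE_TABLES = ("DBS", "TBLS", "PARTITIONS")


def count_metastore_objects(conn):
    """Return row counts per backing table plus their total.

    `conn` is any DB-API connection to the metastore's backing database.
    """
    counts = {}
    cur = conn.cursor()
    for table in METASTORE_TABLES:
        # Table names come from the fixed tuple above, never from user input.
        cur.execute(f"SELECT COUNT(*) FROM {table}")
        counts[table] = cur.fetchone()[0]
    counts["total"] = sum(counts[t] for t in METASTORE_TABLES)
    return counts


def exceeds_warning(counts, threshold=OBJECT_COUNT_WARNING):
    """True when the total object count is above the rule-of-thumb threshold."""
    return counts["total"] > threshold


def demo_connection():
    """Build a tiny in-memory stand-in for the backing database (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE DBS (DB_ID INTEGER PRIMARY KEY, NAME TEXT)")
    conn.execute(
        "CREATE TABLE TBLS (TBL_ID INTEGER PRIMARY KEY, DB_ID INTEGER, TBL_NAME TEXT)"
    )
    conn.execute(
        "CREATE TABLE PARTITIONS (PART_ID INTEGER PRIMARY KEY, TBL_ID INTEGER, PART_NAME TEXT)"
    )
    conn.execute("INSERT INTO DBS VALUES (1, 'default')")
    conn.executemany(
        "INSERT INTO TBLS VALUES (?, 1, ?)", [(i, f"t{i}") for i in range(1, 4)]
    )
    conn.executemany(
        "INSERT INTO PARTITIONS VALUES (?, 1, ?)", [(i, f"ds={i}") for i in range(1, 6)]
    )
    return conn


if __name__ == "__main__":
    counts = count_metastore_objects(demo_connection())
    print(counts, "over threshold:", exceeds_warning(counts))
```

Against a real deployment you would replace `demo_connection()` with a read-only connection to the metastore's own database (MySQL, PostgreSQL, and so on) and run the counts at a quiet time.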
Regards,
Chethan YM
Created ‎08-29-2022 04:33 AM
Thank you for the explanation. To clarify, what is the definition of "total number object"? Does it refer to the total number of "metadata objects", as defined on this page about Hive Design?
Created ‎08-29-2022 05:07 AM
Hi,
Yes. The Hive metastore is the component that stores all the structure information (metadata) for objects such as tables and partitions in the warehouse, including column and column type information.
Regards,
Chethan YM
Note: If this answered your question please accept the reply as a solution.
