Multiple Hive Metastore instances and compactor thread
Created ‎06-08-2022 03:19 AM
Hi,
I want to ask a few questions about running multiple Hive Metastore instances and how to correctly configure the compaction initiator and cleaner threads on them.
We run multiple Hive Metastore instances because we have many concurrent connections and, according to the official documentation (https://docs.cloudera.com/cdp-private-cloud-upgrade/latest/release-guide/topics/cdpdc-hive.html), it is not recommended to have more than 16-24 GB of Java heap because of the impact of Java garbage collection on active processing by the service. 16-24 GB of Java heap is the recommended amount of memory for 41-80 concurrent connections (we have around 50 on each Metastore).
So my question is: what is the correct configuration for the compactor initiator and cleaner threads (hive.compactor.initiator.on) when there are multiple Hive Metastore instances? Can it be enabled on every instance, or should it be enabled on only one?
Our environment is running CDH 7.1.7-1 with Hive version 3.1.3000.7.1.7.1000-141.
Thanks in advance,
Kind regards.
Created ‎06-08-2022 08:39 AM
Hello Gabriel,
Only one metastore instance should have the compactor enabled.
Personally, I would use the last of the metastores listed in hive-site.xml for Hive on Tez, since it is normally the least used.
Best.
-JMP
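As an illustration of the recommendation above, here is a minimal sketch of how the setting might look in the metastore configuration. It assumes the property is set directly in each instance's hive-site.xml; on a cluster managed by Cloudera Manager it would more likely be set per role instance through the management UI or a configuration snippet, and the choice of which instance gets `true` is only an example.

```xml
<!-- Metastore instance chosen to run the initiator and cleaner threads
     (assumption: the last, least used metastore in the list) -->
<property>
  <name>hive.compactor.initiator.on</name>
  <value>true</value>
</property>

<!-- Every other metastore instance -->
<property>
  <name>hive.compactor.initiator.on</name>
  <value>false</value>
</property>
```

Each fragment belongs in a different instance's configuration; the two are shown together here only for comparison.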
Created ‎06-09-2022 12:56 AM
Hi Jose Manuel,
Thank you for your answer. I was confused because the Hive Wiki says that as of Hive 1.3.0 it may be enabled on more than one instance, but I wanted to be sure:
"Before Hive 1.3.0 it's critical that this is enabled on exactly one metastore service instance. As of Hive 1.3.0 this property may be enabled on any number of standalone metastore instances."
Kind regards.
Created ‎09-21-2022 04:20 PM
Gabriel,
We'd still recommend enabling it on only one instance, to avoid race conditions and similar issues.
Thank You,
-JMP
