HiveServer2 sizing

Master Mentor

Do we have recommendations for HiveServer2 sizing? I found a reference on Google to a maximum of 40 users per HiveServer2 instance with 12 GB of RAM. I know it can vary a lot with usage, but is there a reference number to start from?

1 ACCEPTED SOLUTION

Master Guru

I think the answer depends much more on the number of queries per second than on RAM. 1 GB is not enough, but once you have 8-12 GB you should be fine outside of very specific use cases. The bigger problem is that HiveServer2 reaches its limits at around 10-15 queries per second. It is better in 2.3, which has parallel planning, but it will not be able to do much more than 10-20 q/s in any case. Adding more RAM will not help at that point; increasing the number of parallel server threads will, and so will adding more HiveServer2 instances.

Of course, in most situations HiveServer2 will not be the bottleneck when you run into these kinds of query rates.
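As a rough illustration of the reasoning above, here is a minimal sketch that turns an expected peak query rate into a HiveServer2 instance count. The function name, the default of 10 q/s per instance (the low end of the range quoted in this answer), and the 25% headroom are all assumptions for illustration, not measured or official figures; replace them with numbers from your own cluster.

```python
import math


def hiveserver2_instances_needed(peak_queries_per_second,
                                 per_instance_qps=10.0,
                                 headroom=0.25):
    """Estimate how many HiveServer2 instances a peak query rate calls for.

    per_instance_qps: assumed ceiling for one instance (hypothetical default,
        taken from the rough 10-20 q/s range mentioned in the thread).
    headroom: assumed fraction of that ceiling to keep in reserve.
    """
    if peak_queries_per_second < 0:
        raise ValueError("peak_queries_per_second must be non-negative")
    if per_instance_qps <= 0:
        raise ValueError("per_instance_qps must be positive")
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in the range [0, 1)")

    # Only plan against the part of the ceiling that is not held in reserve.
    usable_qps = per_instance_qps * (1.0 - headroom)

    # Always run at least one instance, even for a negligible load.
    return max(1, math.ceil(peak_queries_per_second / usable_qps))


if __name__ == "__main__":
    for rate in (5, 30, 100):
        print(rate, "q/s ->", hiveserver2_instances_needed(rate), "instance(s)")
```

Note that RAM does not appear in the estimate at all, which matches the point of the answer: beyond roughly 8-12 GB per instance, query rate rather than memory drives the count.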


5 REPLIES

Master Mentor


It will vary with the nature of the data, the queries, and the node specs. Generally speaking, the HiveServer2 endpoint can now be clustered and scaled out horizontally, with users load-balanced across the farm. It's not so much about creating a perfectly sized single instance as about having a good starting point (e.g. from that article), monitoring the process and node consistently, and experimenting with a specific cluster deployment.
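One common way to spread users across several HiveServer2 instances is ZooKeeper-based dynamic service discovery, where clients connect to the ZooKeeper quorum instead of a fixed host. The reply does not say which mechanism it has in mind, so treat this as one possible sketch: it only builds the JDBC URL, the helper name and host names are hypothetical, and it assumes discovery is already enabled on the servers and registered under the `hiveserver2` namespace.

```python
def zookeeper_discovery_jdbc_url(zk_hosts, namespace="hiveserver2", database=""):
    """Build a HiveServer2 JDBC URL that uses ZooKeeper service discovery.

    zk_hosts: list of "host:port" strings for the ZooKeeper quorum
        (hypothetical values in the example below).
    namespace: ZooKeeper namespace the HiveServer2 instances register under
        (assumed here to be "hiveserver2").
    database: optional database name; empty means the default database.
    """
    if not zk_hosts:
        raise ValueError("at least one ZooKeeper host is required")

    quorum = ",".join(zk_hosts)
    return (
        f"jdbc:hive2://{quorum}/{database}"
        f";serviceDiscoveryMode=zooKeeper"
        f";zooKeeperNamespace={namespace}"
    )


if __name__ == "__main__":
    # Hypothetical quorum; substitute the hosts of your own cluster.
    print(zookeeper_discovery_jdbc_url(
        ["zk1.example.com:2181", "zk2.example.com:2181", "zk3.example.com:2181"]
    ))
```

With a URL of this shape, each new connection is handed to one of the registered instances, so adding a HiveServer2 instance does not require changing client configuration.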

Master Mentor

New Contributor

I am seeing extremely slow HiveServer2 performance on an underutilized cluster with a capacity of 28 TB. I have checked all the configurations: NameNode RPC, LDAP, network, etc. All is well there. I am on HDP 2.3.4 and Ambari 2.2.1. I am curious about the embedded metastore within HiveServer2.