Created 11-09-2015 01:47 PM
Do we have recommendations for HiveServer2 sizing? I found a reference on Google suggesting a maximum of 40 users per HiveServer2 instance with 12 GB of RAM. I know it can vary a lot based on usage, but do we have a reference number to start from?
Created 11-09-2015 01:51 PM
I think the answer depends much more on the number of queries per second than on RAM. 1 GB is not enough, but once you have 8-12 GB you should be fine outside of very specific use cases. The real problem is that HiveServer2 reaches its limits when you run 10-15 queries per second. It is better in 2.3, which has parallel planning, but it will not be able to do much more than 10-20 q/s in any case. Adding more RAM will not help; increasing the number of parallel server threads and, obviously, adding more HiveServer2 instances will.
Obviously, in most situations HiveServer2 will not be the bottleneck by the time you run into these kinds of query numbers.
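To turn the numbers above into a starting point, here is a rough back-of-the-envelope sketch in Python. It assumes each HiveServer2 instance sustains about 10 queries per second (the low end of the range mentioned above) and adds some headroom. The function name, the default rate and the 25% headroom are illustrative assumptions, not measured values, so treat the result as a first guess to validate with monitoring.

```python
import math


def instances_needed(peak_qps: float,
                     per_instance_qps: float = 10.0,
                     headroom: float = 0.25) -> int:
    """Estimate how many HiveServer2 instances a peak query rate needs.

    per_instance_qps and headroom are assumptions for illustration;
    replace them with figures measured on your own cluster.
    """
    if peak_qps < 0 or per_instance_qps <= 0 or headroom < 0:
        raise ValueError("rates must be positive and headroom non-negative")
    # Inflate the peak by the headroom, then divide by what one instance handles.
    required = peak_qps * (1.0 + headroom) / per_instance_qps
    # Always run at least one instance, and round partial instances up.
    return max(1, math.ceil(required))


if __name__ == "__main__":
    for qps in (5, 15, 40):
        print(f"{qps} q/s -> {instances_needed(qps)} instance(s)")
```

Note that RAM does not appear in the estimate at all, which matches the point above: past roughly 8-12 GB per instance, the query rate is what drives the instance count.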
Created 11-09-2015 01:51 PM
It will vary based on the nature of the data, the queries and the node specs. Generally speaking, the HiveServer2 endpoint can now be clustered and scaled out horizontally, and users can be load-balanced across this farm. It's not so much about creating a perfectly sized single instance as about having a good starting point (e.g. from that article), monitoring the process and the node in a consistent way, and experimenting with a specific cluster deployment.
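One common way to spread users across several HiveServer2 instances is ZooKeeper-based service discovery, where clients connect to the ZooKeeper quorum rather than to a fixed HiveServer2 host. The sketch below only builds the JDBC connection string. It assumes dynamic service discovery is enabled on the servers and that they register under the `hiveserver2` namespace; the hostnames are hypothetical placeholders and the helper name is made up for illustration.

```python
from typing import Sequence


def hs2_discovery_url(zk_hosts: Sequence[str],
                      namespace: str = "hiveserver2",
                      database: str = "") -> str:
    """Build a HiveServer2 JDBC URL that uses ZooKeeper service discovery.

    zk_hosts are "host:port" strings for the ZooKeeper quorum. The
    namespace default is an assumption; check your own Hive configuration.
    """
    if not zk_hosts:
        raise ValueError("at least one ZooKeeper host is required")
    quorum = ",".join(zk_hosts)
    return (f"jdbc:hive2://{quorum}/{database}"
            f";serviceDiscoveryMode=zooKeeper"
            f";zooKeeperNamespace={namespace}")


if __name__ == "__main__":
    # Hypothetical ZooKeeper hosts, for illustration only.
    print(hs2_discovery_url(["zk1.example.com:2181",
                             "zk2.example.com:2181",
                             "zk3.example.com:2181"]))
```

With a URL like this, adding or removing a HiveServer2 instance does not require changing client configuration, which makes it easier to experiment with the size of the farm.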
Created 11-10-2015 01:05 AM
Created 07-14-2016 04:41 PM
I am experiencing extremely slow HiveServer2 performance on an unutilized cluster with a capacity of 28 TB. I have checked all the configurations: NameNode RPC, LDAP, network, etc. All is well there. I have HDP 2.3.4 and Ambari 2.2.1. I am curious about the embedded metastore within HiveServer2.