Created 06-08-2020 12:33 AM
We are using Hue 4.6.0 with PostgreSQL as its backend and have noticed that it renders very slowly as the number of active users increases. After login, the web UI often takes a minute or more to load. Running a query usually takes even longer to start and often shows the red exclamation icon (the warning that the query is hanging or taking longer than usual), and clicking through the different options in Hue is also very slow, taking over a minute to load.
One observation we made when this happens: once the requests.active value under desktop/metrics goes above roughly 12 to 15, Hue starts to slow down, and the higher the number, the slower it becomes, often ending in just a blank page. We have enabled debug logging, and most of the time the costly operations are under notebook/api (execute, check_status):
access INFO 1.2.3.4 <username> - "POST /notebook/api/execute/hive HTTP/1.1" returned in 13895ms 200 1547 (mem: 1031mb)
access INFO 1.2.3.4 <username> - "POST /notebook/api/check_status HTTP/1.1" returned in 12429ms 200 76 (mem: 1031mb)
We think this could be related to backend concurrency, which we are investigating and testing.
We have also tested the above behaviour with an nginx LB in front of Hue, and the results were the same.
Most end users (and superusers/admins like us) also have this issue during the day (i.e., when Hue is busy). We are trying to identify the bottleneck and need assistance.
Created 06-10-2020 03:50 AM
@sbharadi You are going to need to do some Performance Tuning and High Availability in Hue to increase the speed and response of the UI for many concurrent users. I do not think a single instance will satisfy your users.
You can find some info at the following link:
https://gethue.com/performance-tuning/
These are older articles (check the deeper links), and some relate to Cloudera Manager, but the concepts should still apply, especially the basic Hue performance ideas: memory, more instances, a load balancer in front of those instances, and even adjustments to Hive itself.
Last but not least, you may find a better response over at the Hue Discourse. @Romain is active here but not always watching.
If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.
Thanks,
Steven @ DFHZ
Created on 06-09-2020 01:56 AM - edited 06-09-2020 02:12 AM
We made some config changes to the Hue PostgreSQL backend DB, as shown below, and then restarted the PostgreSQL and Hue services, expecting somewhat better performance.
# updated values in postgresql.conf
shared_buffers = 4GB
effective_cache_size = 4GB
maintenance_work_mem = 1GB
wal_buffers = 16MB
Unfortunately, there seems to have been no change. Below are some log snippets showing the time taken for a few requests:
access INFO 1.2.3.4 <username> - "POST /notebook/api/check_status HTTP/1.1" returned in 29794ms 200 77 (mem: 332mb)
access INFO 1.2.3.4 <username> - "POST /notebook/api/describe/db_name/tbl_name/ HTTP/1.1" returned in 36359ms 200 4270 (mem: 326mb)
access INFO 1.2.3.4 <username> - "POST /notebook/api/sample/db_name/tbl_name HTTP/1.1" returned in 38153ms 200 179 (mem: 326mb)
access INFO 1.2.3.4 <username> - "POST /notebook/api/execute/hive HTTP/1.1" returned in 41019ms 200 1153 (mem: 313mb)
access INFO 1.2.3.4 <username> - "POST /notebook/api/autocomplete/db_lab HTTP/1.1" returned in 80593ms 200 1823 (mem: 332mb)
access INFO 1.2.3.4 <username> - "POST /notebook/download HTTP/1.1" returned in 42439ms 200 8639 (mem: 326mb)
and
access INFO 1.2.3.4 <username> - "GET /desktop/globalJsConstants.js HTTP/1.1" returned in 49975ms 304 0 (mem: 326mb)
access INFO 1.2.3.4 <username> - "GET /desktop/globalJsConstants.js HTTP/1.1" returned in 78294ms 200 65475 (mem: 326mb)
We are still seeing delays of a good 30s to more than 1 min in the Hue web UI. Most of the time, the costly entries are under /notebook/api.
Is this expected, or is there a way to improve these numbers?
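To compare runs before and after a config change, the access log lines above could be summarized per endpoint. The following is only a minimal sketch: the regular expression is written against the log format quoted in this thread, and the function names (endpoint_group, summarize) and the three-segment grouping depth are assumptions made for illustration, not anything Hue provides.

```python
import re
from collections import defaultdict
from statistics import median

# Matches the request and latency part of lines such as:
# access INFO 1.2.3.4 <username> - "POST /notebook/api/execute/hive HTTP/1.1" returned in 13895ms 200 1547 (mem: 1031mb)
LINE_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) HTTP/[\d.]+" '
    r'returned in (?P<ms>\d+)ms (?P<status>\d{3})'
)


def endpoint_group(path, depth=3):
    """Keep only the first `depth` path segments so that calls such as
    /notebook/api/describe/<db>/<table>/ land in one bucket."""
    parts = [p for p in path.split("?")[0].split("/") if p]
    return "/" + "/".join(parts[:depth])


def summarize(lines):
    """Return (endpoint, count, median_ms, max_ms) rows, slowest median first.
    Lines that do not match the expected format are ignored."""
    buckets = defaultdict(list)
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            key = endpoint_group(match.group("path"))
            buckets[key].append(int(match.group("ms")))
    rows = [(name, len(ms), median(ms), max(ms)) for name, ms in buckets.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)
```

Feeding it the lines of an access log file (for example, `summarize(open(path))` with a path of your own) gives a quick view of which endpoints dominate. In the snippets above, even the static /desktop/globalJsConstants.js request takes tens of seconds, which suggests the delay is not specific to the notebook API.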
Created 06-10-2020 02:52 AM
Our Hue nodes are edge nodes within the cluster, so all the clients and their respective configs are already available on the Hue servers. We can successfully run a Hive query from the Hive CLI on the Hue servers, and results are returned very quickly.
Furthermore, we have done a network sanity check between the Hue servers and the master nodes in the cluster, such as HS2 and HMS. nslookup works from the Hue nodes and resolves the FQDNs of all the master nodes, and nc succeeds against the HS2 and HMS ports (10000/9083) from the Hue nodes too.
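The nslookup and nc checks described above can be repeated from a script so they are easy to rerun on every Hue node. This is a rough sketch using only the Python standard library; check_endpoint is a name made up here, and the hostnames in the usage note are placeholders, not real cluster names.

```python
import socket


def check_endpoint(host, port, timeout=3.0):
    """Resolve `host`, then attempt a TCP connection to `port`.
    Returns (ok, detail) rather than raising, so results can be listed."""
    try:
        addr = socket.gethostbyname(host)
    except socket.gaierror as exc:
        return False, f"DNS lookup for {host} failed: {exc}"
    try:
        # create_connection performs the TCP handshake, like `nc -z host port`
        with socket.create_connection((addr, port), timeout=timeout):
            return True, f"{host} ({addr}) port {port} reachable"
    except OSError as exc:
        return False, f"{host} ({addr}) port {port} unreachable: {exc}"
```

For example, `check_endpoint("hs2.example.internal", 10000)` and `check_endpoint("hms.example.internal", 9083)` would cover the two ports mentioned above (both hostnames are hypothetical). A successful connect only shows that the port is open; it says nothing about how quickly HS2 or HMS answer under load.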
We also enabled the CherryPy server explicitly in hue.ini, even though the configuration notes say it is the default.
# added/updated in hue.ini
use_cherrypy_server=true
cherrypy_server_threads=50
We then restarted the services, but we are not seeing much, if any, improvement. Below are the latest log snippets:
access INFO 1.2.3.4 <username> - "GET /desktop/globalJsConstants.js HTTP/1.1" returned in 90056ms 200 65484 (mem: 316mb)
access INFO 1.2.3.4 <username> - "GET /desktop/globalJsConstants.js HTTP/1.1" returned in 81560ms 200 65485 (mem: 316mb)
access INFO 1.2.3.4 <username> - "POST /notebook/api/check_status HTTP/1.1" returned in 35355ms 200 76 (mem: 316mb)
access INFO 1.2.3.4 <username> - "POST /notebook/api/check_status HTTP/1.1" returned in 35324ms 200 77 (mem: 316mb)
access INFO 1.2.3.4 <username> - "POST /notebook/api/check_status HTTP/1.1" returned in 35272ms 200 77 (mem: 316mb)
access INFO 1.2.3.4 <username> - "POST /notebook/download HTTP/1.1" returned in 35708ms 200 7869 (mem: 316mb)
access INFO 1.2.3.4 <username> - "POST /notebook/api/execute/hive HTTP/1.1" returned in 34088ms 200 604 (mem: 316mb)
access INFO 1.2.3.4 <username> - "POST /notebook/api/execute/hive HTTP/1.1" returned in 33781ms 200 1217 (mem: 316mb)
access INFO 1.2.3.4 <username> - "POST /notebook/api/get_logs HTTP/1.1" returned in 31527ms 200 2597 (mem: 316mb)
access INFO 1.2.3.4 <username> - "POST /notebook/api/get_logs HTTP/1.1" returned in 31392ms 200 5848 (mem: 316mb)
access INFO 1.2.3.4 <username> - "POST /notebook/api/close_statement HTTP/1.1" returned in 31324ms 200 71 (mem: 313mb)
access INFO 1.2.3.4 <username> - "POST /notebook/api/close_statement HTTP/1.1" returned in 30807ms 200 72 (mem: 333mb)
We appreciate any pointers and help on this issue.
Thank you.
Created 06-11-2020 06:17 AM
Thank you @stevenmatison, we appreciate the pointers provided.
We are not using a conventional nginx load balancer but an F5 BIG-IP instead. After examining its configuration, we identified a connection limit parameter that was the culprit. We have updated it, and this has made some difference to the user experience.
We are looking to increase the number of Hue and HS2 instances in the cluster to boost overall and query performance.
Many thanks.
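For anyone trying to confirm a similar bottleneck, the effect of a connection limit can be illustrated by measuring worst-case latency at increasing concurrency. The sketch below is an illustration under stated assumptions, not a description of how the F5 or Hue behaves internally: make_limited_task is a hypothetical stand-in that simulates a backend serving at most `limit` calls at once, and all function names are made up for this example.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def timed_call(task):
    """Run task() and return its wall-clock duration in seconds."""
    start = time.monotonic()
    task()
    return time.monotonic() - start


def latency_by_concurrency(task, levels):
    """For each concurrency level, start that many calls at once and
    record the slowest one. Returns {level: worst_latency_seconds}."""
    worst = {}
    for level in levels:
        with ThreadPoolExecutor(max_workers=level) as pool:
            durations = list(pool.map(timed_call, [task] * level))
        worst[level] = max(durations)
    return worst


def make_limited_task(limit, service_time):
    """Hypothetical backend behind a connection limit: at most `limit`
    calls are served concurrently, each taking `service_time` seconds."""
    gate = threading.Semaphore(limit)

    def task():
        with gate:  # callers beyond the limit wait here
            time.sleep(service_time)

    return task
```

With `make_limited_task(limit=2, service_time=0.05)`, the worst latency at four concurrent callers is roughly double the single-caller latency, because two callers have to wait for a free slot. The same measuring function could be pointed at a real endpoint by passing a task that performs an HTTP request, assuming you have a URL that is safe to load-test.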