Created on 03-20-2018 04:17 PM - edited 09-16-2022 06:00 AM
HAProxy is being used for load balancing among the Impala daemons, and we've enabled logging to a remote server. However, hardly any logs are being produced, and only at certain times, such as when the Impala roles are restarted.
We would like to be able to check the haproxy logs and find which Impala daemon a particular query was executed on. Does anyone know if a configuration change is required for haproxy or Impala? This seems like something haproxy should log, and therefore a fairly standard requirement.
Attached is the /etc/haproxy/haproxy.cfg file.
Thank you
Created 03-21-2018 06:13 AM
I see that you mentioned attaching a file, but I do not see it. If the upload was unsuccessful, can you reach out to me via private message so I can investigate the posting issue?
Created on 03-21-2018 09:28 AM - edited 03-21-2018 09:32 AM
Below are the contents of the haproxy.cfg file:
    #---------------------------------------------------------------------
    # Example configuration for a possible web application. See the
    # full configuration options online.
    #
    # http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
    #
    #---------------------------------------------------------------------

    #---------------------------------------------------------------------
    # Global settings
    #---------------------------------------------------------------------
    global
        # to have these messages end up in /var/log/haproxy.log you will
        # need to:
        #
        # 1) configure syslog to accept network log events. This is done
        #    by adding the '-r' option to the SYSLOGD_OPTIONS in
        #    /etc/sysconfig/syslog
        #
        # 2) configure local2 events to go to the /var/log/haproxy.log
        #    file. A line like the following can be added to
        #    /etc/sysconfig/syslog
        #
        #    local2.* /var/log/haproxy.log
        #
        log 172.28.xx.xx local2

        chroot /var/lib/haproxy
        pidfile /var/run/haproxy.pid
        maxconn 4000
        user haproxy
        group haproxy
        daemon

        # turn on stats unix socket
        # stats socket /var/lib/haproxy/stats

    #---------------------------------------------------------------------
    # common defaults that all the 'listen' and 'backend' sections will
    # use if not designated in their block
    #---------------------------------------------------------------------
    defaults
        mode tcp
        log global
        option tcplog
        option dontlognull
        # option http-server-close
        # option forwardfor except 127.0.0.0/8
        option redispatch
        retries 3
        timeout http-request 10s
        timeout queue 1m
        timeout connect 10s
        #Values increased for fixing timeout issue with impala queries
        #timeout client 1m
        #timeout server 1m
        timeout client 120m
        timeout server 120m
        timeout http-keep-alive 10s
        timeout check 10s
        maxconn 3000

    #---------------------------------------------------------------------
    # main frontend which proxys to the backends
    #---------------------------------------------------------------------
    frontend main *:5000
        acl url_static path_beg -i /static /images /javascript /stylesheets
        acl url_static path_end -i .jpg .gif .png .css .js
        use_backend static if url_static
        default_backend app
        option tcplog
        log 172.28.xx.xx local2 debug

    #---------------------------------------------------------------------
    # static backend for serving up images, stylesheets and such
    #---------------------------------------------------------------------
    backend static
        balance roundrobin
        #server static 127.0.0.1:4331 check

    #This sets up the admin page for HA proxy at port 25002
    #listen stats :25002
    # listen stats
    # bind hdp004v.cmssvc.local:20020 ssl crt /data/security/hdp004v/hdp004v.crt.pem
    # balance
    # mode http
    # stats enable
    # stats auth username:password
    # stats uri /haproxy?stats

    #This is the setup for Imapala. Impala client connect to load_balancer_host:25003
    #HAProxy will balance connections among the list of servers listed below.
    # The list of impalad is listening at port 21000 for beeswax (impala-shell) or original ODBC driver.
    # For JDBC or ODBC version 2.x driver, use port 21050 instead of 21000
    listen impala :25003
        mode tcp
        option tcplog
        balance leastconn
        server impalad1 hdp104v.cmssvc.local:21000
        server impalad2 hdp105v.cmssvc.local:21000

    # setup for hue or toher JDBC-enabled applications. Hue requires sticky sessions.
    # The application connects to load_balancer_host:21051, and HAProxy balances
    # connections to the associated hosts, where Imapala listens for JDBC
    # requests on port 21050
    listen impalajdbc :21051
        mode tcp
        option tcplog
        balance source
        log 172.28.xx.xx local2 debug
        server impalajdbc1 hdp104v.cmssvc.local:21050
        server impalajdbc2 hdp105v.cmssvc.local:21050

    #---------------------------------------------------------------------
    # round robin balancing between the various backends
    #---------------------------------------------------------------------
    backend app
        balance roundrobin
        #server app1 127.0.0.1:5001 check
        #server app2 127.0.0.1:5002 check
        #server app3 127.0.0.1:5003 check
        #server app4 127.0.0.1:5004 check
Thanks,
Braz
Created on 03-28-2018 11:22 AM - edited 03-28-2018 11:23 AM
All I'm seeing are messages like the following when restarting haproxy:
    2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy main started.
    2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy static started.
    2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy impala started.
    2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy impalajdbc started.
    2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy main started.
    2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy impalajdbc started.
    2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy app started.
And sometimes other messages like the following in the haproxy log - but they don't come in immediately after running an Impala query over JDBC such as from Hue or beeline:
    2018-03-27T15:51:47-04:00 172.28.2.234 haproxy[4286]: 172.28.6.234:37768 [27/Mar/2018:15:51:47.621] impalajdbc impalajdbc/impalajdbc1 0/0/+0 +0 -- 2/2/2/1/0 0/0
    2018-03-27T18:08:08-04:00 172.28.2.234 haproxy[4286]: 172.28.2.20:39978 [27/Mar/2018:18:08:08.888] impala impala/impalad1 0/0/+0 +0 -- 3/1/1/1/0 0/0
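For reference, the connection lines above already name the backend that was chosen: with option tcplog, the fields after the syslog prefix are the client address, the accept date, the frontend name, and then backend/server, where the server part is the name given on the server line in haproxy.cfg (impalajdbc1 is hdp104v.cmssvc.local:21050 in the config posted here). Below is a rough Python sketch for pulling those fields out of the log. The regular expression and the parse_tcplog helper are illustrative assumptions written against the two sample lines, not something shipped with haproxy.

```python
import re

# Assumed layout, based on the sample lines in this thread:
# <syslog prefix> haproxy[pid]: client_ip:port [accept_date] frontend backend/server ...
TCPLOG = re.compile(
    r"haproxy\[(?P<pid>\d+)\]:\s+"
    r"(?P<client_ip>[\d.]+):(?P<client_port>\d+)\s+"
    r"\[(?P<accept_date>[^\]]+)\]\s+"
    r"(?P<frontend>\S+)\s+"
    r"(?P<backend>[^/\s]+)/(?P<server>\S+)\s+"
)


def parse_tcplog(line):
    """Return the connection fields as a dict, or None for other messages
    such as 'Proxy main started.'."""
    match = TCPLOG.search(line)
    return match.groupdict() if match else None


sample = (
    "2018-03-27T15:51:47-04:00 172.28.2.234 haproxy[4286]: "
    "172.28.6.234:37768 [27/Mar/2018:15:51:47.621] "
    "impalajdbc impalajdbc/impalajdbc1 0/0/+0 +0 -- 2/2/2/1/0 0/0"
)

entry = parse_tcplog(sample)
# Show which client connection went to which named server.
print(entry["client_ip"], "->", entry["server"])
```

Note that this only maps a client connection to a server name; it says nothing about individual queries, since in tcp mode haproxy sees connections rather than the statements sent over them.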
However, no logs come in when accessing from a BI tool such as Tableau using its native Cloudera Impala connector or the Cloudera Impala ODBC driver.
Is there a way to increase the logging for haproxy so that we can tell which Impala daemon a query is executing on, for the purpose of debugging potential issues when someone is accessing Impala from a BI application? I already have it set to debug under the listen section for impalajdbc:
    listen impalajdbc :21051
        mode tcp
        option tcplog
        balance roundrobin
        log 172.28.xx.xx local2 debug
        server impalajdbc1 hdp104v.cmssvc.local:21050
        server impalajdbc2 hdp105v.cmssvc.local:21050
Thanks,
Braz