Created on 03-20-2018 04:17 PM - edited 09-16-2022 06:00 AM
HAProxy is being used to load-balance among the Impala daemons, and we've enabled logging to a remote server. However, hardly any logs are being produced, and only at certain times, such as when the Impala roles are restarted.
We would like to be able to check the haproxy logs and find which Impala daemon a particular query was executed on. Does anyone know if a configuration change is required for haproxy or Impala? This seems like something haproxy should log messages about, and thus fairly standard.
Attached is the /etc/haproxy/haproxy.cfg file.
Thank you
Created on 03-28-2018 11:22 AM - edited 03-28-2018 11:23 AM
All I'm seeing are messages like the following when restarting haproxy:
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy main started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy static started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy impala started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy impalajdbc started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy main started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy impalajdbc started.
2018-03-28T11:32:41-04:00 172.28.2.234 haproxy[26642]: Proxy app started.
Other messages like the following sometimes appear in the haproxy log, but they don't come in immediately after running an Impala query over JDBC, such as from Hue or beeline:
2018-03-27T15:51:47-04:00 172.28.2.234 haproxy[4286]: 172.28.6.234:37768 [27/Mar/2018:15:51:47.621] impalajdbc impalajdbc/impalajdbc1 0/0/+0 +0 -- 2/2/2/1/0 0/0
2018-03-27T18:08:08-04:00 172.28.2.234 haproxy[4286]: 172.28.2.20:39978 [27/Mar/2018:18:08:08.888] impala impala/impalad1 0/0/+0 +0 -- 3/1/1/1/0 0/0
However, no logs come in when accessing from a BI tool such as Tableau using its native Cloudera Impala connector or the Cloudera Impala ODBC driver.
Is there a way to increase the logging for haproxy so that we can tell which Impala daemon a query is executing on, for the purpose of debugging potential issues when someone connects from a BI application? I already have the log level set to debug under the listen section for impalajdbc:
listen impalajdbc :21051
mode tcp
option tcplog
balance roundrobin
log 172.28.xx.xx local2 debug
server impalajdbc1 hdp104v.cmssvc.local:21050
server impalajdbc2 hdp105v.cmssvc.local:21050
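For the connection-level lines that do arrive (the ones ending in something like impalajdbc/impalajdbc1), the field after the frontend name is backend/server, so the server name can be pulled out of each line. Below is a minimal Python sketch of that, assuming the log lines keep exactly the layout of the samples earlier in this post. The names TCPLOG_RE and parse_tcplog are made up for illustration and are not part of haproxy or Impala.

```python
import re

# Matches the connection-level lines written with "option tcplog" as they
# appear in the samples in this post: client, accept date, frontend, and
# backend/server. The pattern is an assumption based on those samples only.
TCPLOG_RE = re.compile(
    r"haproxy\[\d+\]: "
    r"(?P<client>\S+) "
    r"\[(?P<accepted>[^\]]+)\] "
    r"(?P<frontend>\S+) "
    r"(?P<backend>[^/\s]+)/(?P<server>\S+)"
)


def parse_tcplog(line):
    """Return the connection fields as a dict, or None for any other line."""
    match = TCPLOG_RE.search(line)
    return match.groupdict() if match else None


sample = (
    "2018-03-27T15:51:47-04:00 172.28.2.234 haproxy[4286]: "
    "172.28.6.234:37768 [27/Mar/2018:15:51:47.621] "
    "impalajdbc impalajdbc/impalajdbc1 0/0/+0 +0 -- 2/2/2/1/0 0/0"
)
fields = parse_tcplog(sample)
print(fields["client"], "->", fields["server"])
```

The server name maps back to a host through the server lines in haproxy.cfg (impalajdbc1 is hdp104v.cmssvc.local:21050). This only identifies which daemon a client connection was sent to. Tying it to a specific query would still mean matching the client address and timestamp by hand.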
Thanks,
Braz
Created 03-21-2018 06:13 AM
I see that you mentioned attaching a file, but I do not see it. If the upload was unsuccessful, can you reach out to me via private message so I can investigate the posting issue?
Created on 03-21-2018 09:28 AM - edited 03-21-2018 09:32 AM
Below are the contents of the haproxy.cfg file:
#---------------------------------------------------------------------
# Example configuration for a possible web application. See the
# full configuration options online.
#
# http://haproxy.1wt.eu/download/1.4/doc/configuration.txt
#
#---------------------------------------------------------------------
#---------------------------------------------------------------------
# Global settings
#---------------------------------------------------------------------
global
# to have these messages end up in /var/log/haproxy.log you will
# need to:
#
# 1) configure syslog to accept network log events. This is done
# by adding the '-r' option to the SYSLOGD_OPTIONS in
# /etc/sysconfig/syslog
#
# 2) configure local2 events to go to the /var/log/haproxy.log
# file. A line like the following can be added to
# /etc/sysconfig/syslog
#
# local2.* /var/log/haproxy.log
#
log 172.28.xx.xx local2
chroot /var/lib/haproxy
pidfile /var/run/haproxy.pid
maxconn 4000
user haproxy
group haproxy
daemon
# turn on stats unix socket
# stats socket /var/lib/haproxy/stats
#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
mode tcp
log global
option tcplog
option dontlognull
# option http-server-close
# option forwardfor except 127.0.0.0/8
option redispatch
retries 3
timeout http-request 10s
timeout queue 1m
timeout connect 10s
#Values increased for fixing timeout issue with impala queries
#timeout client 1m
#timeout server 1m
timeout client 120m
timeout server 120m
timeout http-keep-alive 10s
timeout check 10s
maxconn 3000
#---------------------------------------------------------------------
# main frontend which proxys to the backends
#---------------------------------------------------------------------
frontend main *:5000
acl url_static path_beg -i /static /images /javascript /stylesheets
acl url_static path_end -i .jpg .gif .png .css .js
use_backend static if url_static
default_backend app
option tcplog
log 172.28.xx.xx local2 debug
#---------------------------------------------------------------------
# static backend for serving up images, stylesheets and such
#---------------------------------------------------------------------
backend static
balance roundrobin
#server static 127.0.0.1:4331 check
#This sets up the admin page for HA proxy at port 25002
#listen stats :25002
# listen stats
# bind hdp004v.cmssvc.local:20020 ssl crt /data/security/hdp004v/hdp004v.crt.pem
# balance
# mode http
# stats enable
# stats auth username:password
# stats uri /haproxy?stats
#This is the setup for Impala. Impala clients connect to load_balancer_host:25003
#HAProxy will balance connections among the list of servers listed below.
# The list of impalad is listening at port 21000 for beeswax (impala-shell) or original ODBC driver.
# For JDBC or ODBC version 2.x driver, use port 21050 instead of 21000
listen impala :25003
mode tcp
option tcplog
balance leastconn
server impalad1 hdp104v.cmssvc.local:21000
server impalad2 hdp105v.cmssvc.local:21000
# Setup for Hue or other JDBC-enabled applications. Hue requires sticky sessions.
# The application connects to load_balancer_host:21051, and HAProxy balances
# connections to the associated hosts, where Impala listens for JDBC
# requests on port 21050
listen impalajdbc :21051
mode tcp
option tcplog
balance source
log 172.28.xx.xx local2 debug
server impalajdbc1 hdp104v.cmssvc.local:21050
server impalajdbc2 hdp105v.cmssvc.local:21050
#---------------------------------------------------------------------
# round robin balancing between the various backends
#---------------------------------------------------------------------
backend app
balance roundrobin
#server app1 127.0.0.1:5001 check
#server app2 127.0.0.1:5002 check
#server app3 127.0.0.1:5003 check
#server app4 127.0.0.1:5004 check
Thanks,
Braz