Created on 07-28-2017 04:56 PM - edited 09-16-2022 05:00 AM
I am trying to connect to Impala from the edge node of a cluster via HAProxy. I've verified that HAProxy is up and running by using it to connect to other services (Hue, for example), but when I enter the command below I receive the following error:
-sh-4.2$ impala-shell -i haproxy1:21000 -k --ssl
Starting Impala Shell using Kerberos authentication
Using service name 'impala'
SSL is enabled. Impala server certificates will NOT be verified (set --ca_cert to change)
Error connecting: TTransportException, Could not start SASL: Error in sasl_client_start (-1) SASL(-1): generic failure: GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Server not found in Kerberos database)
The Impala settings in HAProxy are shown below. Based on what is outlined at https://www.cloudera.com/documentation/enterprise/5-2-x/topics/impala_proxy.html it seems I've covered all of the standard steps. Is there anything else that needs to be configured for HAProxy to work as a load balancer for Impala?
# IMPALA
listen impala :21000
# bind *:21000
mode tcp
option tcplog
balance leastconn
server worker1 worker1.name:21000
server worker2 worker2.name:21000
server worker3 worker3.name:21000
server worker4 worker4.name:21000
listen impalajdbc :21050
# bind *:21050
mode tcp
option tcplog
balance source
server worker1 worker1.name:21000
server worker2 worker2.name:21000
server worker3 worker3.name:21000
server worker4 worker4.name:21000
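As a quick sanity check on a config like the one above, the sketch below prints each listener's front-end port next to the port of every backend server it forwards to. It assumes the simple `listen NAME :PORT` / `server NAME HOST:PORT` layout used here; the function name `list_ports` and the config path in the usage comments are only illustrative. Run against the config as posted, it would show the 21050 listener forwarding to port 21000 on the workers, which may or may not be intended.

```shell
# Hypothetical helper: for every "server" line, print
#   <listener port> <server name> <backend port>
list_ports() {
    # $1 = path to an HAProxy config file
    awk '
        $1 == "listen" { n = split($3, a, ":"); front = a[n] }
        $1 == "server" { n = split($3, b, ":"); print front, $2, b[n] }
    ' "$1"
}

# Usage (the path is an assumption; adjust to where your config lives):
#   list_ports /etc/haproxy/haproxy.cfg
# HAProxy can also check the file's syntax without starting the proxy:
#   haproxy -c -f /etc/haproxy/haproxy.cfg
```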
Created 07-28-2017 07:07 PM
Created 07-31-2017 03:00 PM
It does seem like the issue is due to there not being a Kerberos ticket for the proxy server. However, I thought that setting "Impala Daemons Load Balancer" to "haproxy_node_name:21000" would take care of this. Per the doc:
Impala Daemons Load Balancer: Address of the load balancer used for Impala daemons. Should be specified in host:port format. If this is specified and Kerberos is enabled, Cloudera Manager adds a principal for 'impala/<load_balancer_host>@<realm>' to the keytab for all Impala daemons.
Created 07-31-2017 03:08 PM
Created 07-31-2017 03:47 PM
Created 07-31-2017 04:03 PM
Created 07-31-2017 04:44 PM
Yes - the FQDN is of the format:
haproxy.company.local
As well, the principal looks like:
impala/haproxy.company.local@COMPANY.LOCAL
Which mirrors the other principals (for example: impala/master-123.company.local@COMPANY.LOCAL)
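One way to confirm that the KDC actually knows this principal is to request a service ticket for it from the edge node. This is a minimal sketch, assuming the MIT Kerberos client tools (`kinit`, `kvno`) are installed and a valid TGT is already in the credential cache; the helper names `impala_spn` and `check_spn` are made up for illustration.

```shell
# Hypothetical helper: build the Impala service principal for a host.
impala_spn() {
    # $1 = host FQDN, $2 = Kerberos realm
    printf 'impala/%s@%s\n' "$1" "$2"
}

# Hypothetical helper: ask the KDC for a service ticket for that principal.
# kvno fails if the KDC has no entry for the principal.
check_spn() {
    kvno "$(impala_spn "$1" "$2")"
}

# Usage (after kinit on the edge node):
#   check_spn haproxy.company.local COMPANY.LOCAL
```

If the request fails for the proxy principal but succeeds for a worker principal, the problem is on the KDC side rather than in HAProxy itself.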
Created 07-31-2017 04:53 PM
Created 08-01-2017 03:34 PM
Using the FQDN in the impala-shell statement results in the same error. SSL was configured following: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/Load_Balancer_Administ...
I've verified the changes outlined there were made to haproxy-https.xml and the SELinux settings are correct. As well, a self-signed cert was used to construct the pem file in /etc/ssl/private.
As for the last part - how do I ensure that Impala is configured to use a particular PEM file? Is there a relevant config setting?
Created 08-01-2017 04:18 PM
Following the Cloudera Doc @ https://www.cloudera.com/documentation/enterprise/5-11-x/topics/impala_proxy.html one potential issue I see is:
After modifying the Impala Daemons Load Balancer field, the keytab files of all the workers running Impala have the haproxy principal present. Calling klist on a worker's keytab file shows:
1 08/01/2017 15:25:11 impala/worker1.company.local@COMPANY.LOCAL
1 08/01/2017 15:25:11 impala/worker2.company.local@COMPANY.LOCAL
1 08/01/2017 15:25:11 impala/worker3.company.local@COMPANY.LOCAL
1 08/01/2017 15:25:11 impala/haproxy1.company.local@COMPANY.LOCAL
1 08/01/2017 15:25:11 impala/haproxy1.company.local@COMPANY.LOCAL
1 08/01/2017 15:25:11 impala/haproxy1.company.local@COMPANY.LOCAL
It looks like the impala principal for haproxy is correctly present. However, I don't believe there is a keytab present on the haproxy node itself. Does there need to be?
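For anyone wanting to repeat the keytab check, a listing like the one above can be produced with `klist -kt` and narrowed to a single host. This is a minimal sketch: the function name `filter_principals` is hypothetical, and the keytab path is deliberately left as a variable because it depends on how the cluster is deployed.

```shell
# Hypothetical helper: keep only keytab entries whose principal is for a given host.
# Reads `klist -kt` output on stdin.
filter_principals() {
    # $1 = host FQDN to look for
    grep -F "/$1@"
}

# Usage (KEYTAB is a placeholder for the impalad keytab path on a worker):
#   klist -kt "$KEYTAB" | filter_principals haproxy1.company.local
```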