Created on 02-17-2017 05:35 AM - edited 09-16-2022 04:06 AM
Hi,
I am configuring Impala to use a load balancer and Kerberos. I have this setup working; however, I am unable to query each daemon directly. Is this normal behavior?
Below is a successful query (through the load balancer) and an unsuccessful one (direct to a daemon):
[centos@kf0 ~]$ klist
Ticket cache: FILE:/tmp/krb5cc_1000
Default principal: alex@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK

Valid starting     Expires            Service principal
02/17/17 13:08:30  02/18/17 13:08:30  krbtgt/CDH-POC-CLUSTER.INTERNAL.CDHNETWORK@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
        renew until 02/24/17 13:08:30
02/17/17 13:08:51  02/18/17 13:08:30  impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
        renew until 02/24/17 13:08:30
02/17/17 13:14:00  02/18/17 13:08:30  impala/dn2.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
        renew until 02/24/17 13:08:30
02/17/17 13:27:16  02/18/17 13:08:30  impala/lb.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
        renew until 02/24/17 13:08:30

[centos@kf0 ~]$ impala-shell --ssl --impalad=lb.cdh-poc-cluster.internal.cdhnetwork:21000 -q "show tables" --ca_cert "/etc/ipa/ca.crt" -k -V
Starting Impala Shell using Kerberos authentication
Using service name 'impala'
SSL is enabled
Connected to lb.cdh-poc-cluster.internal.cdhnetwork:21000
Server version: impalad version 2.7.0-cdh5.10.0 RELEASE (build 785a073cd07e2540d521ecebb8b38161ccbd2aa2)
Query: show tables
Fetched 0 row(s) in 0.43s

[centos@kf0 ~]$ impala-shell --ssl --impalad=dn1.cdh-poc-cluster.internal.cdhnetwork:21000 -q "show tables" --ca_cert "/etc/ipa/ca.crt" -k -V
Starting Impala Shell using Kerberos authentication
Using service name 'impala'
SSL is enabled
Error connecting: TTransportException, TSocket read 0 bytes
Not connected to Impala, could not execute queries.
In the logs I see:
E0217 13:27:36.607559 6262 authentication.cc:160] SASL message (Kerberos (external)): GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (Request ticket server impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK found in keytab but does not match server principal impala/lb.cdh-poc-cluster.internal.cdhnetwork@)
I0217 13:27:36.625763 6262 thrift-util.cc:111] SSL_shutdown: error code: 0
I0217 13:27:36.625901 6262 thrift-util.cc:111] TThreadPoolServer: TServerTransport died on accept: SASL(-13): authentication failure: GSSAPI Failure: gss_accept_sec_context
However, in the keytab file I can see the dn1 principal is present:
[root@dn1 impalad]# klist -kt /run/cloudera-scm-agent/process/64-impala-IMPALAD/impala.keytab
Keytab name: FILE:/run/cloudera-scm-agent/process/64-impala-IMPALAD/impala.keytab
KVNO Timestamp           Principal
---- ------------------- ------------------------------------------------------
   1 02/17/2017 12:03:52 impala/lb.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
   1 02/17/2017 12:03:52 impala/lb.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
   1 02/17/2017 12:03:52 impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
   1 02/17/2017 12:03:52 impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
[root@dn1 impalad]#
And the daemon principals are set correctly:
[root@dn1 impalad]# cat /run/cloudera-scm-agent/process/64-impala-IMPALAD/impala-conf/impalad_flags | grep princ
-principal=impala/lb.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
-be_principal=impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
[root@dn1 impalad]#
So, is it normal behaviour that the daemons can no longer be queried directly once Kerberos has been enabled with a load balancer, or am I doing something wrong?
Thanks
Created 02-17-2017 08:06 AM
It looks normal to me, because the Impala daemon will generally be available on all the nodes, but the server will be on one node (maybe additional nodes if you have HA), so there is no need to connect to every individual node in a distributed system.
Created on 02-17-2017 08:11 PM - edited 02-17-2017 08:21 PM
I believe you have configured a separate host that acts as a proxy, making it handle the requests along with Kerberos. Hence I think you won't be able to bypass the proxy, because it works like a session facade.
https://www.cloudera.com/documentation/enterprise/5-2-x/topics/impala_proxy.html#proxy_kerberos
Created 02-19-2017 03:13 AM
This is normal. Once you set up a load balancer in front of impalad, the impalad exposes itself to external clients through the service principal name (SPN) of the load balancer.
If you check the varz page of an individual impalad, you will notice the following parameters:
https://<impalad-hostname>:25000/varz
principal ==> LB SPN
be_principal ==> IMPALAD SPN
This shows that impalad expects the LB's SPN for client communication, whereas for internal communication (between impalads) it uses its own SPN; be_principal is the backend principal used for internal communication.
Hence it is required to contact the impalad with the LB's SPN.
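A quick way to confirm this on a live daemon is to pull the startup flags from the debug web UI mentioned above. This is just a sketch using the hostnames from this thread; you may need to add --negotiate -u : if the debug web UI itself is Kerberos-protected:

# List the principal-related startup flags of one impalad via its debug web UI
curl -sk "https://dn1.cdh-poc-cluster.internal.cdhnetwork:25000/varz" | grep -i principal
# Expected output (per the flags shown earlier in this thread):
#   -principal=impala/lb.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK
#   -be_principal=impala/dn1.cdh-poc-cluster.internal.cdhnetwork@CDH-POC-CLUSTER.INTERNAL.CDHNETWORK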
Created 02-20-2017 07:32 AM
@venkatsambath Thanks for the confirmation. I'm just thinking in terms of high availability: have we now introduced a single point of failure for Impala? How would you make the load balancer highly available?
You could have multiple load balancers and use DNS CNAMEs to balance across them; however, since we are using Kerberos, the domain name we point requests at must be an A record, so this won't work.
Is any of my thinking above incorrect? How would we make the service highly available when using load balancers and Kerberos for Impala?
Created on 03-05-2017 03:15 AM - edited 03-05-2017 03:18 AM
First of all, I am sorry for getting back late on this question. One of the key factors of Kerberos authentication is its reliance on DNS reverse resolution. Quoting from the MIT KDC documentation - https://web.mit.edu/kerberos/krb5-1.4/krb5-1.4.4/doc/krb5-admin/Getting-DNS-Information-Correct.html

----
[1] Getting DNS Information Correct
Several aspects of Kerberos rely on name service. In order for Kerberos to provide its high level of security, it is less forgiving of name service problems than some other parts of your network. It is important that your Domain Name System (DNS) entries and your hosts have the correct information.
----

Let's say the virtual IP is haproxy.com, and the load balancers are running on the nodes below:

haproxy1.com - 10.0.0.1
haproxy2.com - 10.0.0.2
haproxy3.com - 10.0.0.3

impalad is running on these nodes:

impalad1 - 10.0.0.4
impalad2 - 10.0.0.5
impalad3 - 10.0.0.6

====
# Forward resolution configs for DNS
haproxy1.com IN A 10.0.0.1
haproxy.com IN CNAME haproxy1.com
====

Now haproxy.com resolves to IP 10.0.0.1, but reverse resolution of 10.0.0.1 returns haproxy1.com. This breaks the expectation of Kerberos authentication [1], so the service ticket request will fail when you run impala-shell -i haproxy.com.

[2] So our aim is to achieve DNS resolution like this:

haproxy.com -> 10.0.0.1
10.0.0.1 -> haproxy.com

We can now alter the reverse resolution in DNS to achieve this. Reverse zone configuration:

====
inverse-query-haproxy1.com IN PTR haproxy.com
10.0.0.1 IN CNAME inverse-query-haproxy1.com
====

With the above set of configs we achieve forward and reverse resolution as expected in [2].

Caution note: If you run CM agents on one of the proxy machines, i.e. it is part of the cluster, its identity will have to change permanently to the VIP name, because reverse DNS will now never show the original hostname. This could cause other services to have issues unless listening_hostname is configured to use the VIP name. Ideally the haproxy machine should not be added as part of the cluster in CM hosts control, to avoid this from happening - it should be a standalone box.
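As a quick sanity check of the end result, you can verify both directions from a client node. This is a sketch using the example names and addresses above; dig comes from the bind-utils package:

# Forward lookup: the VIP name should resolve to the VIP address
dig +short haproxy.com
# expect: 10.0.0.1

# Reverse lookup: the VIP address should now map back to the VIP name, not haproxy1.com
dig +short -x 10.0.0.1
# expect the final answer to be: haproxy.com.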
Created 03-05-2017 03:20 AM
The command to test reverse resolution is described in this link:
http://linuxcommando.blogspot.in/2008/07/how-to-do-reverse-dns-lookup.html
Created 03-06-2017 03:29 AM
@venkatsambath This is great information, many thanks!
Created 09-28-2017 11:37 AM
As an alternative, you could enable LDAP authentication for Impala and then connect to the daemons directly, thus bypassing Kerberos and the load balancer.
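For example, something along these lines. This is only a sketch, assuming LDAP authentication has already been enabled on the Impala daemons and that the alex user from this thread exists in the directory; impala-shell will prompt for the password:

# Connect straight to one daemon using LDAP (-l) instead of Kerberos (-k)
impala-shell --ssl --ca_cert /etc/ipa/ca.crt \
  -l -u alex \
  --impalad=dn1.cdh-poc-cluster.internal.cdhnetwork:21000 \
  -q "show tables"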