Created on 09-21-2016 07:39 AM - edited 09-16-2022 03:40 AM
Hi,
After we enabled Kerberos (AD) on the cluster using the wizard, Impala failed to start, while the other services had no problems.
It seems that the principal passed to the StateStore by the Catalog Server is wrong, even though its configuration has no issues; indeed, as you can see in the logs, the principal ends with "@" but no realm.
Here is some information that may be useful:
OS: Red Hat Enterprise Linux Server release 7.2 (Maipo)
CDH Parcel: 5.8.0-1.cdh5.8.0.p0.42
Cloudera Manager: 5.8.1
Service hosts:
Configurations:
krb5.conf:
[libdefaults]
    default_realm = REALM.COM
    dns_lookup_kdc = false
    dns_lookup_realm = false
    ticket_lifetime = 86400
    renew_lifetime = 604800
    forwardable = true
    default_tgs_enctypes = aes256-cts-hmac-sha1-96 rc4-hmac des-cbc-crc des-cbc-md5
    default_tkt_enctypes = aes256-cts-hmac-sha1-96 rc4-hmac des-cbc-crc des-cbc-md5
    permitted_enctypes = aes256-cts-hmac-sha1-96 rc4-hmac des-cbc-crc des-cbc-md5
    udp_preference_limit = 1
    kdc_timeout = 3000

[realms]
    REALM.COM = {
        kdc = realm.com
        admin_server = realm.com
    }
Catalog Server stack trace:
F0921 12:46:36.956574 24185 catalogd-main.cc:76] RPC Error: write() send(): Broken pipe. Impalad exiting.
*** Check failure stack trace: ***
    @ 0x1b465dd (unknown)
    @ 0x1b48f06 (unknown)
    @ 0x1b460fd (unknown)
    @ 0x1b499ae (unknown)
    @ 0x7f77db (unknown)
    @ 0x7c39c6 (unknown)
    @ 0x7f1289e7cb15 __libc_start_main
    @ 0x7f658d (unknown)
Picked up JAVA_TOOL_OPTIONS: -Xms4294967296 -Xmx4294967296 -XX:OnOutOfMemoryError=/usr/lib64/cmf/service/common/killparent.sh
StateStore stack trace:
E0921 12:46:36.836042 23719 authentication.cc:155] SASL message (Kerberos (internal)): GSSAPI Error: Unspecified GSS failure. Minor code may provide more information (No key table entry found matching impala/clouderavalm1@)
Here are the auto-generated flags:
Catalog:
-catalog_service_port=26000
-max_log_files=10
-enable_webserver=true
-load_auth_to_local_rules=true
-load_catalog_in_background=true
-webserver_port=25020
-kerberos_reinit_interval=60
-principal=impala/CLOUDERAVALM1@REALM.COM
-keytab_file=/run/cloudera-scm-agent/process/818-impala-CATALOGSERVER/impala.keytab
-log_filename=catalogd
-statestore_subscriber_timeout_seconds=30
-state_store_host=CLOUDERAVALM1
-state_store_port=24000
-minidump_path=/var/log/impala-minidumps
-max_minidumps=9
StateStore:
-state_store_pending_task_count_max=0
-max_log_files=10
-state_store_port=24000
-enable_webserver=true
-webserver_port=25010
-state_store_num_server_worker_threads=4
-kerberos_reinit_interval=60
-principal=impala/CLOUDERAVALM1@REALM.COM
-keytab_file=/run/cloudera-scm-agent/process/820-impala-STATESTORE/impala.keytab
-log_filename=statestored
-minidump_path=/var/log/impala-minidumps
-max_minidumps=9
Here are the auto-generated keytabs (in the directory /run/cloudera-scm-agent/process/...):
Catalog:
slot KVNO Principal
---- ---- ---------------------------------------------------------------------
   1    1 impala/CLOUDERAVALM1@REALM.COM
   2    1 impala/CLOUDERAVALM1@REALM.COM
   3    1 impala/CLOUDERAVALM1@REALM.COM
   4    1 impala/CLOUDERAVALM1@REALM.COM
   5    1 impala/CLOUDERAVALM1@REALM.COM
   6    1 impala/CLOUDERAVALM1@REALM.COM
State store:
slot KVNO Principal
---- ---- ---------------------------------------------------------------------
   1    1 impala/CLOUDERAVALM1@REALM.COM
   2    1 impala/CLOUDERAVALM1@REALM.COM
   3    1 impala/CLOUDERAVALM1@REALM.COM
   4    1 impala/CLOUDERAVALM1@REALM.COM
   5    1 impala/CLOUDERAVALM1@REALM.COM
   6    1 impala/CLOUDERAVALM1@REALM.COM
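To see why the StateStore rejects the connection even though the keytab entries look correct, compare the keytab principals above with the principal in the GSSAPI error ("impala/clouderavalm1@"). The sketch below is illustrative only, not Impala's actual code: it mimics the exact-match lookup a Kerberos server does against its keytab, where both case and realm must match.

```python
# Illustrative sketch only -- not Impala's actual matching code.
# A server-side GSSAPI keytab lookup requires an exact principal match,
# including the case of the host part and a non-empty realm.
def parse_principal(principal):
    """Split 'service/host@REALM' into (service, host, realm)."""
    service_host, _, realm = principal.partition("@")
    service, _, host = service_host.partition("/")
    return service, host, realm

# Entries from the auto-generated keytab above.
keytab_entries = ["impala/CLOUDERAVALM1@REALM.COM"]

# Principal from the StateStore error: lowercased host, empty realm after '@'.
requested = "impala/clouderavalm1@"

service, host, realm = parse_principal(requested)
print("realm present:", bool(realm))                 # realm present: False
print("exact match:", requested in keytab_entries)   # exact match: False
print("host matches ignoring case:",
      any(host == parse_principal(e)[1].lower() for e in keytab_entries))
# host matches ignoring case: True
```

This mirrors the "No key table entry found matching impala/clouderavalm1@" error: the requested principal differs from every keytab entry in both case and realm, so the lookup fails.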
If necessary, we can also provide other logs.
Thanks.
Created 01-01-2017 09:16 AM
When you mention you have the same problem, what is the exact error you are getting?
As for the original issue in this post, we see two items that can cause Kerberos issues in Hadoop:
(1) Hosts with no domains (even .local would do).
(2) Capital letters in hostnames.
You have configured the hostname in all capitals: impala/CLOUDERAVALM1@REALM.COM
To have the best chance of getting Kerberos to work, I would recommend verifying the following:
(1) All hosts have fully-qualified domain names. For instance, "hostname" should return the short hostname and "hostname -f" should return the FQDN.
(2) If relying on the hosts file for resolution, make sure that you are using the following format:
IP FQDN HOSTNAME
For example:
10.0.0.2 myhost.example.com myhost
(3) Make sure you use only lowercase host names. Hadoop is sensitive to this at the moment. Though uppercase is technically valid, it will cause problems for sure.
(4) Ensure all hosts can resolve each other with forward and reverse DNS (with FQDNs).
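The checks above can be scripted per host. The sketch below is a rough illustration using Python's standard socket module (hostnames are placeholders; substitute your own cluster hosts):

```python
# Rough per-host sanity check for the recommendations above:
# domain presence, lowercase hostname, and forward/reverse DNS.
import socket

def check_host(name):
    fqdn = socket.getfqdn(name)
    result = {
        "fqdn": fqdn,
        "has_domain": "." in fqdn,        # (1) FQDN has a domain part
        "lowercase": name == name.lower() # (3) no capital letters
    }
    try:
        ip = socket.gethostbyname(fqdn)           # forward lookup
        result["ip"] = ip
        result["reverse"] = socket.gethostbyaddr(ip)[0]  # (4) reverse lookup
    except OSError:
        result["ip"] = result["reverse"] = None   # resolution failed
    return result

print(check_host("localhost"))
```

Running this against each cluster host (and fixing any host where `has_domain` or `lowercase` is False, or resolution returns None) covers the conditions that typically break Kerberos principal generation.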
I think the main problem you are facing is the uppercase hostnames without a domain. It will work fine without Kerberos involved, but when introducing Kerberos, the rules change a bit to support that method of authentication.
After you make the network changes, make sure to regenerate credentials for all roles so that the correct principals are created.
I hope this is a good start.
Regards,
Ben
Created 07-25-2019 10:00 AM
It's a bit late, but I'll post the solution that worked for me.
The problem was the hostnames: Impala with Kerberos wants the hostnames in lowercase.