Created on 05-27-2016 03:59 PM - edited 09-16-2022 03:22 AM
Team,
This error has sucked up my entire week. I have now put more than 40 hours into troubleshooting it and have made zero progress; I am still getting the same error messages. While this is not a critical tool for us, I do not like things broken, so I could really use advice on how to troubleshoot or even fix this. What else can I look at?
Versions: CentOS 6.7, Java 1.7, CDH 5.7, MIT Kerberos 5 1.10, Impala installed from the yum repo
Build: A 12-node cluster running in AWS, with no Cloudera Manager. I have enabled HA for HDFS and YARN, installed Kerberos, and set up SSL using a Java keystore and truststore backed by a self-signed certificate. HDFS, YARN, MapReduce, Hive, Oozie, and HBase all work from the command line.
Impala fails. Impala worked well prior to installing Kerberos.
Error Message: This is the critical error message. It implies that impala-catalog is not obtaining a Kerberos ticket.
There are follow-on messages about not reaching the metastore, but I have focused on the GSS error.
Java exception follows:
javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
IllegalStateException. MetaException: could not connect to meta store using any of the URIs provided. Most recent failure: thrift.transport.TTransportException: GSS initiate failed.
Diagnosis: The Impala daemons are able to kinit a TGT, but they may not be requesting service tickets (TGS) correctly.
I believe this is a configuration error: some parameter is not being passed into Impala correctly, either on the Kerberos side or on the Impala side.
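One way to test the TGT-versus-TGS theory by hand is to force a service-ticket request with kvno (a sketch; hdfs/master02.invalid@HADOOPREALM is my guess at the NameNode principal, based on the destination host in the error):
sudo -u impala kinit -k -t /etc/impala/conf/impala.keytab impala/master03.invalid@HADOOPREALM
sudo -u impala klist                                   # shows the krbtgt entry (the TGT)
sudo -u impala kvno hdfs/master02.invalid@HADOOPREALM  # asks the KDC for a service ticket
sudo -u impala klist                                   # a second, service-ticket entry should now appear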
Troubleshooting:
1. Impala worked prior to the installation of Kerberos and failed immediately after. I have made lots of different configuration changes to the default Impala file. No change in errors.
2. Tested DNS. Valid.
3. Validated the default Impala file against the Cloudera CDH 5.1 manual. Valid.
4. Validated the JCE jar installation. Valid and working with Kerberos. I moved this up to AES-256 and then back down again. No change in errors.
5. Set the Kerberos default to des3-cbc-sha1 for all principals. Rebuilt the Kerberos database. Validated that all principals are using the same encryption type. No change.
6. Limited the encryption types to des3-cbc-sha1 only. Rebuilt the Kerberos database. Validated that all principals are using the same encryption type. Allowed weak encryption. No change.
7. Started impala-catalog by hand on the master servers and via the service script, attempting to force other errors. No change.
8. Added the following line to hadoop-env.sh and restarted the cluster. No change. This was really a long shot, but by this time I was willing to try anything.
# WKD added due to Kerberos issues related to Impala.
export JAVA_OPTS="-Djavax.security.auth.useSubjectCredsOnly=false ${JAVA_OPTS}"
9. I hunted through all of the supporting config files for some parameter that might affect only Impala, since all of the other apps worked. In particular, HBase came up with no hesitation or follow-on troubleshooting.
10. I have tried kinit -R several times and have validated that I am getting renewable tickets, as shown below.
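For reference, the renewal check (the principal name varies per host):
sudo -u impala kinit -k -t /etc/impala/conf/impala.keytab impala/master03.invalid@HADOOPREALM
sudo -u impala klist      # the "renew until" line confirms the ticket is renewable
sudo -u impala kinit -R   # renews in place; errors out if the ticket is not renewable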
Current output:
****IMPALA PARAMETERS
--heap_profile_dir=
--hostname=master03.invalid
--keytab_file=/etc/impala/conf/impala.keytab
--krb5_conf=/etc/krb5.conf
--krb5_debug_file=
--mem_limit=80%
--principal=impala/master03.invalid@HADOOPREALM
*****PRINCIPALS
kadmin: getprinc impala/master03.invalid@HADOOPREALM
Principal: impala/master03.invalid@HADOOPREALM
Expiration date: [never]
Last password change: Fri May 27 19:38:20 UTC 2016
Password expiration date: [none]
Maximum ticket life: 2 days 00:00:00
Maximum renewable life: 7 days 00:00:00
Last modified: Fri May 27 19:38:20 UTC 2016 (hdadmin/admin@HADOOPREALM)
Last successful authentication: [never]
Last failed authentication: [never]
Failed password attempts: 0
Number of keys: 4
Key: vno 1, des3-cbc-sha1, no salt
Key: vno 1, arcfour-hmac, no salt
Key: vno 1, des-hmac-sha1, no salt
Key: vno 1, des-cbc-md5, no salt
kadmin: getprinc krbtgt/HADOOPREALM@HADOOPREALM
Principal: krbtgt/HADOOPREALM@HADOOPREALM
Expiration date: [never]
Last password change: [never]
Password expiration date: [none]
Maximum ticket life: 2 days 00:00:00
Maximum renewable life: 7 days 00:00:00
Last modified: Fri May 27 18:45:45 UTC 2016 (db_creation@HADOOPREALM)
Last successful authentication: [never]
Last failed authentication: [never]
Failed password attempts: 0
Number of keys: 5
Key: vno 1, des3-cbc-sha1, no salt
Key: vno 1, arcfour-hmac, no salt
Key: vno 1, des-hmac-sha1, no salt
Key: vno 1, des-cbc-md5, no salt
Key: vno 1, des-cbc-crc, no salt
impala$ klist -e
Ticket cache: FILE:/tmp/krb5cc_490
Default principal: impala/master03.invalid@HADOOPREALM
Valid starting Expires Service principal
05/27/16 21:46:31 05/29/16 21:46:31 krbtgt/HADOOPREALM@HADOOPREALM
renew until 06/03/16 21:45:50, Etype (skey, tkt): des3-cbc-sha1, des3-cbc-sha1
Current Config Files:
*****IMPALA DEFAULT
IMPALA_BACKEND_PORT=22000
IMPALA_STATE_STORE_HOST=master03.invalid
IMPALA_STATE_STORE_PORT=24000
IMPALA_CATALOG_SERVICE_HOST=master03.invalid
IMPALA_CATALOG_SERVICE_PORT=26000
IMPALA_LOG_DIR=/var/log/impala
IMPALA_STATE_STORE_ARGS=" -state_store_port=${IMPALA_STATE_STORE_PORT} -kerberos_reinit_interval=60 -principal=impala/${IMPALA_STATE_STORE_HOST}@HADOOPREALM -keytab_file=/etc/impala/conf/impala.keytab -log_dir=${IMPALA_LOG_DIR} "
IMPALA_CATALOG_ARGS=" -kerberos_reinit_interval=60 -principal=impala/${IMPALA_STATE_STORE_HOST}@HADOOPREALM -keytab_file=/etc/impala/conf/impala.keytab -log_dir=${IMPALA_LOG_DIR} "
IMPALA_SERVER_ARGS=" -be_port=${IMPALA_BACKEND_PORT} -use_statestore -state_store_host=${IMPALA_STATE_STORE_HOST} -state_store_port=${IMPALA_STATE_STORE_PORT} -catalog_service_host=${IMPALA_CATALOG_SERVICE_HOST} -kerberos_reinit_interval=60 -principal=impala/master03.invalid@HADOOPREALM -keytab_file=/etc/impala/conf/impala.keytab -log_dir=${IMPALA_LOG_DIR} "
ENABLE_CORE_DUMPS=false
****KRB5.CONF
[libdefaults]
default_realm = HADOOPREALM
dns_lookup_realm = false
dns_lookup_kdc = false
ticket_lifetime = 2d
renew_lifetime = 7d
forwardable = true
renewable = true
allow_weak_crypto = true
default_tgs_enctypes = des3-hmac-sha1 arcfour-hmac des-hmac-sha1 des-cbc-md5 des-cbc-crc aes128-cts aes256-cts
default_tkt_enctypes = des3-hmac-sha1 arcfour-hmac des-hmac-sha1 des-cbc-md5 des-cbc-crc aes128-cts aes256-cts
[realms]
HADOOPREALM = {
kdc = admin01.invalid
admin_server = admin01.invalid
default_domain = invalid
}
[domain_realm]
.invalid = HADOOPREALM
invalid = HADOOPREALM
****KDC.CONF
[kdcdefaults]
kdc_ports = 88,750
[realms]
HADOOPREALM = {
database_name = /var/kerberos/krb5kdc/principal
acl_file = /var/kerberos/krb5kdc/kadm5.acl
dict_file = /var/kerberos/krb5kdc/kadm5.dict
admin_keytab = /var/kerberos/krb5kdc/kadm5.keytab
key_stash_file = /var/kerberos/krb5kdc/.k5.HADOOPREALM
kadmind_port = 749
allow-tickets = true
forwardable = true
renewable = true
max_life = 2d 0h 0m 0s
max_renewable_life = 7d 0h 0m 0s
master_key_type = des3-hmac-sha1
supported_enctypes = des3-hmac-sha1:normal arcfour-hmac:normal des-hmac-sha1:normal des-cbc-md5:normal des-cbc-crc:normal aes128-cbc:normal aes256-cbc:normal
default_principal_flags = +renewable,+forwardable,+postdateable,+proxiable,+tgt-based,+service
}
Created 05-30-2016 09:04 PM
Team,
I wrote a script today to convert the Java keystore into PEM files and distributed a crt, key, and CA crt around the cluster, hoping there might be some connection to the Kerberos error. This did not resolve it, and I still have the same error messages:
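The conversion itself is roughly the following (a sketch of what the script does; hadoop.jks is my assumed keystore name, and password prompts are omitted):
keytool -importkeystore -srckeystore hadoop.jks -srcstoretype JKS -destkeystore hadoop.p12 -deststoretype PKCS12
openssl pkcs12 -in hadoop.p12 -nokeys -clcerts -out /etc/pki/tls/certs/hadoop.crt
openssl pkcs12 -in hadoop.p12 -nocerts -nodes -out /etc/pki/tls/private/hadoop.key
openssl pkcs12 -in hadoop.p12 -nokeys -cacerts -out /etc/pki/tls/certs/hadoop.ca.crt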
Updated /etc/default/impala file:
IMPALA_BACKEND_PORT=22000
IMPALA_STATE_STORE_HOST=master03.invalid
IMPALA_STATE_STORE_PORT=24000
IMPALA_CATALOG_SERVICE_HOST=master03.invalid
IMPALA_CATALOG_SERVICE_PORT=26000
IMPALA_LOG_DIR=/var/log/impala
IMPALA_STATE_STORE_ARGS=" -state_store_port=${IMPALA_STATE_STORE_PORT} -kerberos_reinit_interval=60 -principal=impala/${IMPALA_STATE_STORE_HOST}@HADOOPREALM -keytab_file=/etc/impala/conf/impala.keytab -ssl_server_certificate=/etc/pki/tls/certs/hadoop.crt -ssl_private_key=/etc/pki/tls/private/hadoop.key -ssl_client_ca_certificate=/etc/pki/tls/certs/hadoop.ca.crt -log_dir=${IMPALA_LOG_DIR} "
IMPALA_CATALOG_ARGS=" -kerberos_reinit_interval=60 -principal=impala/${IMPALA_STATE_STORE_HOST}@HADOOPREALM -keytab_file=/etc/impala/conf/impala.keytab -ssl_server_certificate=/etc/pki/tls/certs/hadoop.crt -ssl_private_key=/etc/pki/tls/private/hadoop.key -ssl_client_ca_certificate=/etc/pki/tls/certs/hadoop.ca.crt -log_dir=${IMPALA_LOG_DIR} "
IMPALA_SERVER_ARGS=" -be_port=${IMPALA_BACKEND_PORT} -use_statestore -state_store_host=${IMPALA_STATE_STORE_HOST} -state_store_port=${IMPALA_STATE_STORE_PORT} -catalog_service_host=${IMPALA_CATALOG_SERVICE_HOST} -kerberos_reinit_interval=60 -principal=impala/master03.invalid@HADOOPREALM -keytab_file=/etc/impala/conf/impala.keytab -ssl_server_certificate=/etc/pki/tls/certs/hadoop.crt -ssl_private_key=/etc/pki/tls/private/hadoop.key -ssl_client_ca_certificate=/etc/pki/tls/certs/hadoop.ca.crt -log_dir=${IMPALA_LOG_DIR} "
ENABLE_CORE_DUMPS=false
Created 05-31-2016 02:07 AM
Sorry for the inconvenience here; the error messages should be better at aiding debugging. Can you paste the contents of the catalog-server startup log up to the point where it fails to obtain a TGT? Also, did you try manually kinit'ing with the keytab and catalog principal to make sure that works? What's the output of "klist -kt /etc/impala/conf/impala.keytab"?
Created 06-02-2016 10:50 AM
Good morning,
I had time last night to troubleshoot this problem. Here is an update with outputs and logs; there is quite a bit, to give you a view of the entire Impala system. I am still working on the theory that Impala is not able to gain a TGT on its own, and that even when it has a TGT it is not receiving a TGS. I spent a fair amount of time checking the general configuration settings, hoping to find a permissions or group-membership error that might affect Kerberos.
We now have an interesting state. The state-store daemon starts and stays up, and the impalad on data01 starts and stays up (see the logs). But I cannot determine why the other three impalads, on data02 through data04, will not start (see the logs). The impala-catalog also fails on startup; sometimes it will start and stay up, but with lots of Kerberos-related error messages.
TROUBLESHOOTING
1. No changes have been made to the principals on the KDC.
2. No changes have been made to the keytabs. I have previously updated the KDC by first deleting the principals and keytabs and then re-creating them.
3. I checked that impala is a member of the group hadoop on all systems (commands shown after this list).
4. I checked that the keytab is owned by impala:impala with permissions of 600.
5. No changes have been made to the krb5.conf or kdc.conf files.
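For reference, the checks from items 3 and 4, run on each node:
id impala                              # expect hadoop in the group list
ls -l /etc/impala/conf/impala.keytab   # expect: -rw------- 1 impala impala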
At this point I am stuck and could use at least some troubleshooting tips. I can back down the SSL use in the Impala defaults file, and I can replace the principals and keytabs again, but this feels like re-covering the same ground. Your assistance will be greatly appreciated.
*****CURRENT DEFAULT IMPALA
IMPALA_BACKEND_PORT=22000
IMPALA_STATE_STORE_HOST=master03.invalid
IMPALA_STATE_STORE_PORT=24000
IMPALA_CATALOG_SERVICE_HOST=master03.invalid
IMPALA_CATALOG_SERVICE_PORT=26000
IMPALA_LOG_DIR=/var/log/impala
IMPALA_STATE_STORE_ARGS=" \
-state_store_port=${IMPALA_STATE_STORE_PORT} \
-kerberos_reinit_interval=60 \
-principal=impala/master03.invalid@HADOOPREALM \
-keytab_file=/etc/impala/conf/impala.keytab \
-ssl_server_certificate=/etc/pki/tls/certs/hadoop.crt \
-ssl_private_key=/etc/pki/tls/private/hadoop.key \
-ssl_client_ca_certificate=/etc/pki/tls/certs/hadoop.ca.crt \
-log_dir=${IMPALA_LOG_DIR} "
IMPALA_CATALOG_ARGS=" \
-kerberos_reinit_interval=60 \
-principal=impala/master03.invalid@HADOOPREALM \
-keytab_file=/etc/impala/conf/impala.keytab \
-ssl_server_certificate=/etc/pki/tls/certs/hadoop.crt \
-ssl_private_key=/etc/pki/tls/private/hadoop.key \
-ssl_client_ca_certificate=/etc/pki/tls/certs/hadoop.ca.crt \
-log_dir=${IMPALA_LOG_DIR} "
IMPALA_SERVER_ARGS=" \
-use_statestore \
-state_store_host=${IMPALA_STATE_STORE_HOST} \
-state_store_port=${IMPALA_STATE_STORE_PORT} \
-catalog_service_host=${IMPALA_CATALOG_SERVICE_HOST} \
-catalog_service_port=${IMPALA_CATALOG_SERVICE_PORT} \
-be_port=${IMPALA_BACKEND_PORT} \
-kerberos_reinit_interval=60 \
-principal=impala/data02.invalid@HADOOPREALM \
-keytab_file=/etc/impala/conf/impala.keytab \
-ssl_server_certificate=/etc/pki/tls/certs/hadoop.crt \
-ssl_private_key=/etc/pki/tls/private/hadoop.key \
-ssl_client_ca_certificate=/etc/pki/tls/certs/hadoop.ca.crt \
-log_dir=${IMPALA_LOG_DIR} "
ENABLE_CORE_DUMPS=false
******IMPALA-STATE-STORE STARTUP
hdadmin@master03:impala> sudo -u impala kdestroy
hdadmin@master03:impala> sudo service impala-state-store start
Started Impala State Store Server (statestored): [ OK ]
hdadmin@master03:impala> sudo -u impala klist
klist: No credentials cache found (ticket cache FILE:/tmp/krb5cc_490)
hdadmin@master03:impala> sudo service impala-state-store status
Impala State Store Server is running [ OK ]
hdadmin@master03:impala>
Log file created at: 2016/06/02 16:34:13
Running on machine: master03.invalid
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0602 16:34:13.122653 2033 logging.cc:120] stderr will be logged to this file.
*****IMPALA-CATALOG STARTUP
hdadmin@master03:impala> sudo -u impala klist -ket /etc/impala/conf/impala.keytab
Keytab name: FILE:/etc/impala/conf/impala.keytab
KVNO Timestamp Principal
---- ----------------- --------------------------------------------------------
1 05/27/16 19:38:21 impala/master03.invalid@HADOOPREALM (des3-cbc-sha1)
1 05/27/16 19:38:21 impala/master03.invalid@HADOOPREALM (arcfour-hmac)
1 05/27/16 19:38:21 impala/master03.invalid@HADOOPREALM (des-hmac-sha1)
1 05/27/16 19:38:21 impala/master03.invalid@HADOOPREALM (des-cbc-md5)
1 05/27/16 19:38:21 HTTP/master03.invalid@HADOOPREALM (des3-cbc-sha1)
1 05/27/16 19:38:21 HTTP/master03.invalid@HADOOPREALM (arcfour-hmac)
1 05/27/16 19:38:21 HTTP/master03.invalid@HADOOPREALM (des-hmac-sha1)
1 05/27/16 19:38:21 HTTP/master03.invalid@HADOOPREALM (des-cbc-md5)
hdadmin@master03:impala> sudo service impala-catalog start
Started Impala Catalog Server (catalogd) : [ OK ]
hdadmin@master03:impala> sudo service impala-catalog status
Impala Catalog Server is running [ OK ]
hdadmin@master03:impala> sudo service impala-catalog status
Impala Catalog Server is running [ OK ]
Log file created at: 2016/06/02 16:45:37
Running on machine: master03.invalid
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0602 16:45:37.612249 2184 logging.cc:120] stderr will be logged to this file.
E0602 16:45:41.492449 2184 authentication.cc:155] SASL message (Kerberos (internal)): No worthy mechs found
E0602 16:45:42.274075 2240 CatalogServiceCatalog.java:190] Error loading cache pools:
Java exception follows:
java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "master03.invalid/172.31.2.211"; destination host is: "master02.invalid":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1475)
at org.apache.hadoop.ipc.Client.call(Client.java:1408)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy9.listCachePools(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.listCachePools(ClientNamenodeProtocolTranslatorPB.java:1247)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy10.listCachePools(Unknown Source)
at org.apache.hadoop.hdfs.protocol.CachePoolIterator.makeRequest(CachePoolIterator.java:55)
at org.apache.hadoop.hdfs.protocol.CachePoolIterator.makeRequest(CachePoolIterator.java:33)
at org.apache.hadoop.fs.BatchedRemoteIterator.makeRequest(BatchedRemoteIterator.java:77)
at org.apache.hadoop.fs.BatchedRemoteIterator.makeRequestIfNeeded(BatchedRemoteIterator.java:85)
at org.apache.hadoop.fs.BatchedRemoteIterator.hasNext(BatchedRemoteIterator.java:99)
at com.cloudera.impala.catalog.CatalogServiceCatalog$CachePoolReader.run(CatalogServiceCatalog.java:185)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
***CATALOGD INFO LOG
Log file created at: 2016/06/02 16:45:37
Running on machine: master03.invalid
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0602 16:45:37.612102 2184 logging.cc:119] stdout will be logged to this file.
E0602 16:45:37.612249 2184 logging.cc:120] stderr will be logged to this file.
I0602 16:45:37.614852 2184 authentication.cc:678] Using internal kerberos principal "impala/master03.invalid@HADOOPREALM"
I0602 16:45:37.614866 2184 authentication.cc:1013] Internal communication is authenticated with Kerberos
I0602 16:45:37.615038 2184 authentication.cc:798] Waiting for Kerberos ticket for principal: impala/master03.invalid@HADOOPREALM
I0602 16:45:37.615051 2204 authentication.cc:494] Registering impala/master03.invalid@HADOOPREALM, keytab file /etc/impala/conf/impala.keytab
I0602 16:45:37.624145 2184 authentication.cc:800] Kerberos ticket granted to impala/master03.invalid@HADOOPREALM
I0602 16:45:37.624225 2184 authentication.cc:678] Using external kerberos principal "impala/master03.invalid@HADOOPREALM"
I0602 16:45:37.624233 2184 authentication.cc:1029] External communication is authenticated with Kerberos
I0602 16:45:37.624416 2184 init.cc:158] catalogd version 2.5.0-cdh5.7.0 RELEASE (build ad3f5adabedf56fe6bd9eea39147c067cc552703)
Built on Wed, 23 Mar 2016 11:51:12 PST
I0602 16:45:37.624425 2184 init.cc:159] Using hostname: master03.invalid
I0602 16:45:37.625203 2184 logging.cc:155] Flags (see also /varz are on debug webserver):
I0602 16:45:37.625296 2184 init.cc:166] Physical Memory: 7.17 GB
I0602 16:45:37.625303 2184 init.cc:167] OS version: Linux version 2.6.32-642.el6.x86_64 (mockbuild@worker1.bsys.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC) ) #1 SMP Tue May 10 17:27:01 UTC 2016
Clock: clocksource: 'xen', clockid_t: CLOCK_MONOTONIC_COARSE
I0602 16:45:37.625308 2184 init.cc:168] Process ID: 2184
I0602 16:45:39.638878 2184 webserver.cc:216] Starting webserver on 0.0.0.0:25020
I0602 16:45:39.638901 2184 webserver.cc:230] Document root: /usr/lib/impala
I0602 16:45:39.639093 2184 webserver.cc:315] Webserver started
I0602 16:45:39.737767 2184 GlogAppender.java:123] Logging initialized. Impala: INFO, All other: INFO
I0602 16:45:39.738981 2184 JniCatalog.java:92] Java Version Info: Java(TM) SE Runtime Environment (1.7.0_79-b15)
W0602 16:45:40.689484 2184 HiveConf.java:2712] HiveConf of name hive.server.thrift.port does not exist
I0602 16:45:40.728157 2184 HiveMetaStoreClient.java:376] Trying to connect to metastore with URI thrift://client01.invalid:9083
I0602 16:45:41.253762 2184 HiveMetaStoreClient.java:473] Connected to metastore.
I0602 16:45:41.424270 2184 CatalogServiceCatalog.java:474] Loading native functions for database: default
I0602 16:45:41.424904 2184 CatalogServiceCatalog.java:500] Loading Java functions for database: default
I0602 16:45:41.472874 2184 HiveMetaStoreClient.java:502] Closed a connection to metastore, current connections: 5
I0602 16:45:41.483974 2184 statestore-subscriber.cc:179] Starting statestore subscriber
I0602 16:45:41.486539 2184 thrift-server.cc:431] ThriftServer 'StatestoreSubscriber' started on port: 23020
I0602 16:45:41.486553 2184 statestore-subscriber.cc:190] Registering with statestore
E0602 16:45:41.492449 2184 authentication.cc:155] SASL message (Kerberos (internal)): No worthy mechs found
I0602 16:45:41.500990 2184 thrift-client.cc:55] Unable to connect to localhost:24000
I0602 16:45:41.501003 2184 thrift-client.cc:61] (Attempt 1 of 10)
W0602 16:45:42.190608 2240 UserGroupInformation.java:1696] PriviledgedActionException as:impala (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
W0602 16:45:42.193403 2240 Client.java:682] Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
W0602 16:45:42.193591 2240 UserGroupInformation.java:1696] PriviledgedActionException as:impala (auth:KERBEROS) cause:java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
W0602 16:45:42.201658 2240 UserGroupInformation.java:1696] PriviledgedActionException as:impala (auth:KERBEROS) cause:javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
W0602 16:45:42.202028 2240 Client.java:682] Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "master03.invalid/172.31.2.211"; destination host is: "master02.invalid":8020;
at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:772)
at org.apache.hadoop.ipc.Client.call(Client.java:1475)
at org.apache.hadoop.ipc.Client.call(Client.java:1408)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:230)
at com.sun.proxy.$Proxy9.listCachePools(Unknown Source)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.listCachePools(ClientNamenodeProtocolTranslatorPB.java:1247)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
at com.sun.proxy.$Proxy10.listCachePools(Unknown Source)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:304)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:687)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:650)
at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:738)
at org.apache.hadoop.ipc.Client$Connection.access$2900(Client.java:375)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1524)
at org.apache.hadoop.ipc.Client.call(Client.java:1447)
... 24 more
****IMPALAD ON DATA01
hdadmin@data01:impala> sudo -u impala kdestroy
hdadmin@data01:impala> sudo service impala-server start
Started Impala Server (impalad): [ OK ]
hdadmin@data01:impala>
hdadmin@data01:impala> sudo service impala-server status
Impala Server is running [ OK ]
hdadmin@data01:impala> sudo -u impala klist
klist: No credentials cache found (ticket cache FILE:/tmp/krb5cc_490)
Log file created at: 2016/06/02 16:50:51
Running on machine: data01.invalid
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0602 16:50:51.552574 1560 logging.cc:120] stderr will be logged to this file.
E0602 16:50:56.473870 1560 authentication.cc:155] SASL message (Kerberos (internal)): No worthy mechs found
***IMPALAD INFO LOG
Log file created at: 2016/06/02 16:50:51
Running on machine: data01.invalid
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
I0602 16:50:51.552439 1560 logging.cc:119] stdout will be logged to this file.
E0602 16:50:51.552574 1560 logging.cc:120] stderr will be logged to this file.
I0602 16:50:51.560744 1560 authentication.cc:678] Using internal kerberos principal "impala/data01.invalid@HADOOPREALM"
I0602 16:50:51.560756 1560 authentication.cc:1013] Internal communication is authenticated with Kerberos
I0602 16:50:51.560888 1560 authentication.cc:798] Waiting for Kerberos ticket for principal: impala/data01.invalid@HADOOPREALM
I0602 16:50:51.560904 1580 authentication.cc:494] Registering impala/data01.invalid@HADOOPREALM, keytab file /etc/impala/conf/impala.keytab
I0602 16:50:51.580030 1560 authentication.cc:800] Kerberos ticket granted to impala/data01.invalid@HADOOPREALM
I0602 16:50:51.580108 1560 authentication.cc:678] Using external kerberos principal "impala/data01.invalid@HADOOPREALM"
I0602 16:50:51.580116 1560 authentication.cc:1029] External communication is authenticated with Kerberos
I0602 16:50:51.580291 1560 init.cc:158] impalad version 2.5.0-cdh5.7.0 RELEASE (build ad3f5adabedf56fe6bd9eea39147c067cc552703)
Built on Wed, 23 Mar 2016 11:51:12 PST
I0602 16:50:51.580298 1560 init.cc:159] Using hostname: data01.invalid
I0602 16:50:51.581035 1560 logging.cc:155] Flags (see also /varz are on debug webserver):
--catalog_service_port=26000
--load_catalog_in_background=false
--num_metadata_loading_threads=16
--sentry_config=
--disable_optimization_passes=false
--dump_ir=false
--opt_module_dir=
--print_llvm_ir_instruction_count=false
--unopt_module_dir=
--abort_on_config_error=true
--be_port=22000
--be_principal=
--compact_catalog_topic=false
--disable_mem_pools=false
--enable_process_lifetime_heap_profiling=false
--heap_profile_dir=
--hostname=data01.invalid
--keytab_file=/etc/impala/conf/impala.keytab
--krb5_conf=
--krb5_debug_file=
--mem_limit=80%
--principal=impala/data01.invalid@HADOOPREALM
--redaction_rules_file=
--max_log_files=10
--log_filename=impalad
--redirect_stdout_stderr=true
--data_source_batch_size=1024
--exchg_node_buffer_size_bytes=10485760
--enable_partitioned_aggregation=true
--enable_partitioned_hash_join=true
--enable_probe_side_filtering=true
--skip_lzo_version_check=false
--convert_legacy_hive_parquet_utc_timestamps=false
--max_page_header_size=8388608
--parquet_min_filter_reject_ratio=0.10000000000000001
--max_row_batches=0
--runtime_filter_wait_time_ms=1000
--suppress_unknown_disk_id_warnings=false
--enable_phj_probe_side_filtering=true
--enable_ldap_auth=false
--internal_principals_whitelist=hdfs
--kerberos_reinit_interval=60
--ldap_allow_anonymous_binds=false
--ldap_baseDN=
--ldap_bind_pattern=
--ldap_ca_certificate=
--ldap_domain=
--ldap_manual_config=false
--ldap_passwords_in_clear_ok=false
--ldap_tls=false
--ldap_uri=
--sasl_path=
--rpc_cnxn_attempts=10
--rpc_cnxn_retry_interval_ms=2000
--disk_spill_encryption=false
--insert_inherit_permissions=false
--datastream_sender_timeout_ms=120000
--max_cached_file_handles=0
--max_free_io_buffers=128
--min_buffer_size=1024
--num_disks=0
--num_remote_hdfs_io_threads=8
--num_s3_io_threads=16
--num_threads_per_disk=0
--read_size=8388608
--resource_broker_cnxn_attempts=1
--resource_broker_cnxn_retry_interval_ms=3000
--resource_broker_recv_timeout=0
--resource_broker_send_timeout=0
--staging_cgroup=impala_staging
--state_store_host=master03.invalid
--state_store_subscriber_port=23000
--use_statestore=true
--local_library_dir=/tmp
--serialize_batch=false
--status_report_interval=5
--max_filter_error_rate=0.75
--num_threads_per_core=3
--use_local_tz_for_unix_timestamp_conversions=false
--scratch_dirs=/tmp
--queue_wait_timeout_ms=60000
--max_vcore_oversubscription_ratio=2.5
--rm_mem_expansion_timeout_ms=5000
--rm_always_use_defaults=false
--rm_default_cpu_vcores=2
--rm_default_memory=4G
--default_pool_max_queued=200
--default_pool_max_requests=-1
--default_pool_mem_limit=
--disable_pool_max_requests=false
--disable_pool_mem_limits=false
--fair_scheduler_allocation_path=
--llama_site_path=
--require_username=false
--disable_admission_control=true
--log_mem_usage_interval=0
--authorization_policy_file=
--lineage_event_log_dir=
--local_nodemanager_url=
--log_query_to_file=true
--max_audit_event_log_file_size=5000
--max_lineage_log_file_size=5000
--max_profile_log_file_size=5000
--max_result_cache_size=100000
--profile_log_dir=
--query_log_size=25
--ssl_client_ca_certificate=/etc/pki/tls/certs/hadoop.ca.crt
--ssl_private_key=/etc/pki/tls/private/hadoop.key
--ssl_private_key_password_cmd=
--ssl_server_certificate=/etc/pki/tls/certs/hadoop.crt
--statestore_subscriber_cnxn_attempts=10
--statestore_subscriber_cnxn_retry_interval_ms=3000
--statestore_subscriber_timeout_seconds=30
--state_store_port=24000
--statestore_heartbeat_frequency_ms=1000
--statestore_heartbeat_tcp_timeout_seconds=3
--statestore_max_missed_heartbeats=10
--statestore_num_heartbeat_threads=10
--statestore_num_update_threads=10
--statestore_update_frequency_ms=2000
--statestore_update_tcp_timeout_seconds=300
--force_lowercase_usernames=false
--num_cores=0
--web_log_bytes=1048576
--non_impala_java_vlog=0
--periodic_counter_update_period_ms=500
--enable_webserver_doc_root=true
--webserver_authentication_domain=
--webserver_certificate_file=
I0602 16:50:51.581122 1560 init.cc:166] Physical Memory: 7.17 GB
I0602 16:50:51.581128 1560 init.cc:167] OS version: Linux version 2.6.32-642.el6.x86_64 (mockbuild@worker1.bsys.centos.org) (gcc version 4.4.7 20120313 (Red Hat 4.4.7-17) (GCC) ) #1 SMP Tue May 10 17:27:01 UTC 2016
Clock: clocksource: 'xen', clockid_t: CLOCK_MONOTONIC_COARSE
I0602 16:50:51.581133 1560 init.cc:168] Process ID: 1560
I0602 16:50:54.811892 1560 hbase-table-scanner.cc:157] Detected HBase version >= 0.95.2
I0602 16:50:54.863241 1560 GlogAppender.java:123] Logging initialized. Impala: INFO, All other: INFO
I0602 16:50:54.867651 1560 JniFrontend.java:124] Authorization is 'DISABLED'.
I0602 16:50:54.867790 1560 JniFrontend.java:126] Java Version Info: Java(TM) SE Runtime Environment (1.7.0_79-b15)
W0602 16:50:55.543539 1560 HiveConf.java:2712] HiveConf of name hive.server.thrift.port does not exist
I0602 16:50:56.045900 1560 simple-scheduler.cc:90] Admission control is disabled.
I0602 16:50:56.048604 1560 impala-server.cc:1157] Default query options:TQueryOptions {
01: abort_on_error (bool) = false,
02: max_errors (i32) = 0,
03: disable_codegen (bool) = false,
04: batch_size (i32) = 0,
05: num_nodes (i32) = 0,
06: max_scan_range_length (i64) = 0,
07: num_scanner_threads (i32) = 0,
08: max_io_buffers (i32) = 0,
09: allow_unsupported_formats (bool) = false,
10: default_order_by_limit (i64) = -1,
11: debug_action (string) = "",
12: mem_limit (i64) = 0,
13: abort_on_default_limit_exceeded (bool) = false,
15: hbase_caching (i32) = 0,
16: hbase_cache_blocks (bool) = false,
I0602 16:50:56.414800 1560 tmp-file-mgr.cc:106] Using scratch directory /tmp/impala-scratch on disk 0
I0602 16:50:56.415395 1560 simple-logger.cc:76] Logging to: /var/log/impala/profiles//impala_profile_log_1.1-1464886256414
I0602 16:50:56.415550 1560 impala-server.cc:496] Event logging is disabled
I0602 16:50:56.415560 1560 impala-server.cc:404] Lineage logging is disabled
I0602 16:50:56.445484 1560 impala-server.cc:1838] Enabling SSL for Beeswax
I0602 16:50:56.446451 1560 impala-server.cc:1843] Impala Beeswax Service listening on 21000
I0602 16:50:56.447387 1560 impala-server.cc:1860] Enabling SSL for HiveServer2
I0602 16:50:56.447402 1560 impala-server.cc:1865] Impala HiveServer2 Service listening on 21050
I0602 16:50:56.448995 1560 impala-server.cc:1880] Enabling SSL for backend
I0602 16:50:56.449009 1560 impala-server.cc:1885] ImpalaInternalService listening on 22000
I0602 16:50:56.452577 1560 thrift-server.cc:431] ThriftServer 'backend' started on port: 22000
I0602 16:50:56.452587 1560 exec-env.cc:309] Starting global services
I0602 16:50:56.467747 1560 exec-env.cc:396] Using global memory limit: 5.73 GB
I0602 16:50:56.470768 1560 webserver.cc:216] Starting webserver on 0.0.0.0:25000
I0602 16:50:56.470779 1560 webserver.cc:230] Document root: /usr/lib/impala
I0602 16:50:56.470927 1560 webserver.cc:315] Webserver started
I0602 16:50:56.470942 1560 simple-scheduler.cc:170] Starting simple scheduler
I0602 16:50:56.471078 1560 simple-scheduler.cc:218] Simple-scheduler using 172.31.9.163 as IP address
I0602 16:50:56.471093 1560 statestore-subscriber.cc:179] Starting statestore subscriber
I0602 16:50:56.472308 1560 thrift-server.cc:431] ThriftServer 'StatestoreSubscriber' started on port: 23000
I0602 16:50:56.472317 1560 statestore-subscriber.cc:190] Registering with statestore
E0602 16:50:56.473870 1560 authentication.cc:155] SASL message (Kerberos (internal)): No worthy mechs found
I0602 16:50:56.481997 1560 thrift-client.cc:55] Unable to connect to master03.invalid:24000
I0602 16:50:56.482007 1560 thrift-client.cc:61] (Attempt 1 of 10)
******IMPALAD START ON DATA02
hdadmin@data02:impala> sudo -u impala kdestroy
hdadmin@data02:impala> sudo service impala-server start
Started Impala Server (impalad): [ OK ]
hdadmin@data02:impala> sudo -u impala klist
klist: No credentials cache found (ticket cache FILE:/tmp/krb5cc_490)
hdadmin@data02:impala> sudo service impala-server status
Impala Server is dead and pid file exists [FAILED]
Log file created at: 2016/06/02 16:55:16
Running on machine: data02.invalid
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0602 16:55:16.228456 1676 logging.cc:120] stderr will be logged to this file.
E0602 16:55:20.523982 1676 impala-server.cc:247] Could not read the HDFS root directory at hdfs://mycluster. Error was:
Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "data02.invalid/172.31.9.164"; destination host is: "master02.invalid":8020;
E0602 16:55:20.524080 1676 impala-server.cc:249] Aborting Impala Server startup due to improper configuration
*****IMPALAD START ON DATA03
Followed the same procedure on data03, which produced this error:
Log file created at: 2016/06/02 16:58:35
Running on machine: data03.invalid
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0602 16:58:35.087704 1527 logging.cc:120] stderr will be logged to this file.
E0602 16:58:40.240041 1527 impala-server.cc:247] Could not read the HDFS root directory at hdfs://mycluster. Error was:
Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "data03.invalid/172.31.9.165"; destination host is: "master02.invalid":8020;
E0602 16:58:40.240118 1527 impala-server.cc:249] Aborting Impala Server startup due to improper configuration
*****IMPALAD START ON DATA04
Log file created at: 2016/06/02 17:01:37
Running on machine: data04.invalid
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0602 17:01:37.987602 1581 logging.cc:120] stderr will be logged to this file.
E0602 17:01:42.788092 1581 impala-server.cc:247] Could not read the HDFS root directory at hdfs://mycluster. Error was:
Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "data04.invalid/172.31.9.166"; destination host is: "master02.invalid":8020;
E0602 17:01:42.788166 1581 impala-server.cc:249] Aborting Impala Server startup due to improper configuration
*****KINIT FROM THE COMMAND LINE AND START AGAIN ON DATA04
hdadmin@data04:impala> sudo -u impala kinit -k -t /etc/impala/conf/impala.keytab impala/data04.invalid@HADOOPREALM
hdadmin@data04:impala> sudo service impala-server start
Started Impala Server (impalad): [ OK ]
hdadmin@data04:impala> sudo service impala-server status
Impala Server is dead and pid file exists [FAILED]
hdadmin@data04:impala> sudo -u impala klist
Ticket cache: FILE:/tmp/krb5cc_490
Default principal: impala/data04.invalid@HADOOPREALM
Valid starting Expires Service principal
06/02/16 17:04:31 06/04/16 17:04:31 krbtgt/HADOOPREALM@HADOOPREALM
renew until 06/09/16 17:04:31
hdadmin@data04:impala>
Log file created at: 2016/06/02 17:04:56
Running on machine: data04.invalid
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0602 17:04:56.816195 1737 logging.cc:120] stderr will be logged to this file.
E0602 17:05:00.919498 1737 impala-server.cc:247] Could not read the HDFS root directory at hdfs://mycluster. Error was:
Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "data04.invalid/172.31.9.166"; destination host is: "master02.invalid":8020;
E0602 17:05:00.919575 1737 impala-server.cc:249] Aborting Impala Server startup due to improper configuration
****TRYING WITH A RENEWED TICKET ON DATA02
hdadmin@data02:~> sudo -u impala kinit -k -t /etc/impala/conf/impala.keytab impala/data02.invalid@HADOOPREALM
hdadmin@data02:~> sudo -u impala klist
Ticket cache: FILE:/tmp/krb5cc_490
Default principal: impala/data02.invalid@HADOOPREALM
Valid starting Expires Service principal
06/02/16 17:14:13 06/04/16 17:14:13 krbtgt/HADOOPREALM@HADOOPREALM
renew until 06/09/16 17:14:13
hdadmin@data02:~>
hdadmin@data02:~> sudo service impala-server start
Started Impala Server (impalad): [ OK ]
hdadmin@data02:~>
hdadmin@data02:~> sudo service impala-server status
Impala Server is dead and pid file exists [FAILED]
hdadmin@data02:~> sudo -u impala kinit -R -k -t /etc/impala/conf/impala.keytab impala/data02.invalid@HADOOPREALM
hdadmin@data02:~> sudo service impala-server start
Started Impala Server (impalad): [ OK ]
hdadmin@data02:~> sudo service impala-server status
Impala Server is dead and pid file exists [FAILED]
****CURRENT OVERALL STATE
hdadmin@admin01:~> sudo hdinit impala status
master03.invalid: Impala State Store Server is running[ OK ]
master03.invalid: Impala Catalog Server is running[ OK ]
data01.invalid: Impala Server is running[ OK ]
data02.invalid: Impala Server is dead and pid file exists[FAILED]
data03.invalid: Impala Server is dead and pid file exists[FAILED]
data04.invalid: Impala Server is dead and pid file exists[FAILED]
Created 06-02-2016 11:17 AM
One more thing: I also checked DNS forward and reverse lookups. All servers have DNS working. However, this is a local BIND DNS name server that I set up within AWS, and I also have an /etc/hosts file that lists all IPs and server hostnames. So the error is not a failure of Kerberos to look up hosts.
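For reference, the lookups I ran on each node (getent goes through the resolver library, which also consults /etc/hosts, which is why I checked both paths):
host master03.invalid           # forward DNS lookup
host 172.31.2.211               # reverse DNS lookup
getent hosts master03.invalid   # what libc (and thus Kerberos) actually resolves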
Created 06-08-2016 10:19 PM
Can you please check that the kvno of the principals on the failing hosts matches the kvno of the principal in the KDC? Also, do you see any difference in the output of "klist -kt <path_to_impala.keytab>" on working and non-working hosts, especially in the KVNO column? In the output pasted above, I only see it for non-working hosts, where it is 1. Is it the same for working hosts too?
Created 06-10-2016 01:20 PM
Bharathv,
Thank you for the follow up. I am ready to proceed on your next suggestion.
WKD
This is in response to Bharathv's suggestion that this might be a KVNO problem. A KVNO is the Kerberos key version number for each key; if the KDC's version and the keytab's version get out of sync, the keys will not match. This report shows all keys are at version 1. I replaced all principals and all keys during my last major round of troubleshooting.
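A quick way to compare the two sides, assuming a valid TGT is already in the cache (kvno fetches a service ticket and prints the version the KDC reports):
sudo -u impala kvno impala/data03.invalid@HADOOPREALM    # kvno as the KDC reports it
sudo -u impala klist -kt /etc/impala/conf/impala.keytab  # KVNO column as the keytab stores it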
KDC
kadmin: getprinc impala/master03.invalid@HADOOPREALM
Principal: impala/master03.invalid@HADOOPREALM
Expiration date: [never]
Last password change: Fri May 27 19:38:20 UTC 2016
Password expiration date: [none]
Maximum ticket life: 2 days 00:00:00
Maximum renewable life: 7 days 00:00:00
Last modified: Fri May 27 19:38:20 UTC 2016 (hdadmin/admin@HADOOPREALM)
Last successful authentication: [never]
Last failed authentication: [never]
Failed password attempts: 0
Number of keys: 4
Key: vno 1, des3-cbc-sha1, no salt
Key: vno 1, arcfour-hmac, no salt
Key: vno 1, des-hmac-sha1, no salt
Key: vno 1, des-cbc-md5, no salt
MKey: vno 1
Attributes:
Policy: [none]
kadmin: getprinc impala/data03.invalid@HADOOPREALM
Principal: impala/data03.invalid@HADOOPREALM
Expiration date: [never]
Last password change: Fri May 27 19:38:20 UTC 2016
Password expiration date: [none]
Maximum ticket life: 2 days 00:00:00
Maximum renewable life: 7 days 00:00:00
Last modified: Fri May 27 19:38:20 UTC 2016 (hdadmin/admin@HADOOPREALM)
Last successful authentication: [never]
Last failed authentication: [never]
Failed password attempts: 0
Number of keys: 4
Key: vno 1, des3-cbc-sha1, no salt
Key: vno 1, arcfour-hmac, no salt
Key: vno 1, des-hmac-sha1, no salt
Key: vno 1, des-cbc-md5, no salt
MKey: vno 1
Attributes:
Policy: [none]
MASTER03
hdadmin@admin01:~> ssh master03
Last login: Sat Jun 4 19:52:24 2016 from 172.31.2.208
hdadmin@master03:~> sudo -u impala klist -kt /etc/impala/conf/impala.keytab
Keytab name: FILE:/etc/impala/conf/impala.keytab
KVNO Timestamp Principal
---- ----------------- --------------------------------------------------------
1 05/27/16 19:38:21 impala/master03.invalid@HADOOPREALM
1 05/27/16 19:38:21 impala/master03.invalid@HADOOPREALM
1 05/27/16 19:38:21 impala/master03.invalid@HADOOPREALM
1 05/27/16 19:38:21 impala/master03.invalid@HADOOPREALM
1 05/27/16 19:38:21 HTTP/master03.invalid@HADOOPREALM
1 05/27/16 19:38:21 HTTP/master03.invalid@HADOOPREALM
1 05/27/16 19:38:21 HTTP/master03.invalid@HADOOPREALM
1 05/27/16 19:38:21 HTTP/master03.invalid@HADOOPREALM
DATA01
hdadmin@data01:~> sudo -u impala klist -kt /etc/impala/conf/impala.keytab
Keytab name: FILE:/etc/impala/conf/impala.keytab
KVNO Timestamp Principal
---- ----------------- --------------------------------------------------------
1 05/27/16 19:38:22 impala/data01.invalid@HADOOPREALM
1 05/27/16 19:38:22 impala/data01.invalid@HADOOPREALM
1 05/27/16 19:38:22 impala/data01.invalid@HADOOPREALM
1 05/27/16 19:38:22 impala/data01.invalid@HADOOPREALM
1 05/27/16 19:38:22 HTTP/data01.invalid@HADOOPREALM
1 05/27/16 19:38:22 HTTP/data01.invalid@HADOOPREALM
1 05/27/16 19:38:22 HTTP/data01.invalid@HADOOPREALM
1 05/27/16 19:38:22 HTTP/data01.invalid@HADOOPREALM
DATA02
hdadmin@data02:~> sudo -u impala klist -kt /etc/impala/conf/impala.keytab
Keytab name: FILE:/etc/impala/conf/impala.keytab
KVNO Timestamp Principal
---- ----------------- --------------------------------------------------------
1 05/27/16 19:38:22 impala/data02.invalid@HADOOPREALM
1 05/27/16 19:38:22 impala/data02.invalid@HADOOPREALM
1 05/27/16 19:38:22 impala/data02.invalid@HADOOPREALM
1 05/27/16 19:38:22 impala/data02.invalid@HADOOPREALM
1 05/27/16 19:38:22 HTTP/data02.invalid@HADOOPREALM
1 05/27/16 19:38:22 HTTP/data02.invalid@HADOOPREALM
1 05/27/16 19:38:22 HTTP/data02.invalid@HADOOPREALM
1 05/27/16 19:38:22 HTTP/data02.invalid@HADOOPREALM
DATA03
hdadmin@data03:~> sudo -u impala klist -kt /etc/impala/conf/impala.keytab
Keytab name: FILE:/etc/impala/conf/impala.keytab
KVNO Timestamp Principal
---- ----------------- --------------------------------------------------------
1 05/27/16 19:38:22 impala/data03.invalid@HADOOPREALM
1 05/27/16 19:38:22 impala/data03.invalid@HADOOPREALM
1 05/27/16 19:38:22 impala/data03.invalid@HADOOPREALM
1 05/27/16 19:38:22 impala/data03.invalid@HADOOPREALM
1 05/27/16 19:38:22 HTTP/data03.invalid@HADOOPREALM
1 05/27/16 19:38:22 HTTP/data03.invalid@HADOOPREALM
1 05/27/16 19:38:23 HTTP/data03.invalid@HADOOPREALM
1 05/27/16 19:38:23 HTTP/data03.invalid@HADOOPREALM
DATA04
hdadmin@data04:~> sudo -u impala klist -kt /etc/impala/conf/impala.keytab
Keytab name: FILE:/etc/impala/conf/impala.keytab
KVNO Timestamp Principal
---- ----------------- --------------------------------------------------------
1 05/27/16 19:38:23 impala/data04.invalid@HADOOPREALM
1 05/27/16 19:38:23 impala/data04.invalid@HADOOPREALM
1 05/27/16 19:38:23 impala/data04.invalid@HADOOPREALM
1 05/27/16 19:38:23 impala/data04.invalid@HADOOPREALM
1 05/27/16 19:38:23 HTTP/data04.invalid@HADOOPREALM
1 05/27/16 19:38:23 HTTP/data04.invalid@HADOOPREALM
1 05/27/16 19:38:23 HTTP/data04.invalid@HADOOPREALM
1 05/27/16 19:38:23 HTTP/data04.invalid@HADOOPREALM
Created 06-13-2016 12:59 AM
Thanks for checking. The kvnos and principals look fine to me.
- Can you confirm that the OS and Kerberos client libraries are the same on all of these nodes, or are they different? (lsb_release -a, rpm -qa | grep krb...)
- Are you able to run other services on data02 and data04, like the DataNode? Is the issue only with Impala?
Created 06-14-2016 07:48 AM
Hi,
All of these servers are built from a standardized AWS image, so all software was installed once from a Cloudera yum repo. Everything else in my cluster works: HDFS, YARN, Hive, Pig, Sentry, etc. All services are using Kerberos and SSL/TLS. I have an install script which creates all of the principals and keytabs at the same time. I have checked things such as users, permissions, and configuration files several times, and so far everything is consistent.
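For context, per host the install script does roughly this on the KDC (a sketch; the -norandkey export keeps the keytab's kvno in sync with the KDC instead of incrementing it, which matches the vno 1 keys shown above):
kadmin.local -q "addprinc -randkey impala/data01.invalid@HADOOPREALM"
kadmin.local -q "addprinc -randkey HTTP/data01.invalid@HADOOPREALM"
kadmin.local -q "xst -norandkey -k impala-data01.keytab impala/data01.invalid@HADOOPREALM HTTP/data01.invalid@HADOOPREALM"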
I believe that even though the impalad daemon starts on data01, it is still not working correctly: while it does have a TGT, it is not listing a TGS.
How do I start Impala at debug level? I could get you a much more detailed trace.
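One thing I may try in the meantime (my assumption, based on the -krb5_debug_file flag visible in the flag dump above, plus MIT krb5's KRB5_TRACE variable, available in krb5 >= 1.9):
# appended to IMPALA_CATALOG_ARGS in /etc/default/impala
-krb5_debug_file=/var/log/impala/krb5_debug.log
# for command-line tests of the Kerberos libraries themselves
sudo -u impala env KRB5_TRACE=/dev/stdout kinit -k -t /etc/impala/conf/impala.keytab impala/$(hostname -f)@HADOOPREALM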
WKD
Created 02-08-2017 05:25 PM
Same issue for me, but I was able to fix it by reinstalling the krb5-workstation RPM. In other words, once the CDH setup was done, I uninstalled and reinstalled the krb5-workstation package.
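On CentOS 6 that amounts to roughly the following (package name from the stock CentOS repos; confirm what is installed with rpm -qa | grep krb5 first):
sudo yum remove krb5-workstation
sudo yum install krb5-workstation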