Reply
Explorer
Posts: 13
Registered: ‎05-03-2018

Authentication token expired at impala coordinators after Kudu restart

[ Edited ]

Hi,

 

The issue (see log below) causes to be unable to launch impala query on Kudu tables through coordinator (but launch through executors work fine).

Appeared after Kudu (masters and tservers) was restarted.

CDH 5.14.2.

LDAP authentication is enabled for impala coordinator nodes only.

 

Let me know if any additional information is needed.

Any ideas how to avoid the issue?

 

Message in impalad log at impala coordinator daemon:

Tuple(id=0 size=49 slots=[Slot(id=0 type=STRING col_path=[] offset=0 null=(offset=48 mask=1) slot_idx=0 field_idx=-1), Slot(id=1 type=STRING col_path=[] offset=16 null=(offset=48 mask=2) slot_idx=1 field_idx=-1), Slot(id=2 type=STRING col_path=[] offset=32 null=(offset=48 mask=4) slot_idx=2 field_idx=-1)] tuple_path=[])
I0503 09:29:01.486366 41053 coordinator.cc:370] started execution on 1 backends for query 964f6d68c729a5cd:2ad2ba7200000000
I0503 09:29:01.486568 42404 query-state.cc:377] Executing instance. instance_id=964f6d68c729a5cd:2ad2ba7200000000 fragment_idx=0 per_fragment_instance_idx=0 coord_state_idx=0 #in-flight=2
I0503 09:29:01.487128 42403 query-exec-mgr.cc:149] ReleaseQueryState(): query_id=964f6d68c729a5cd:2ad2ba7200000000 refcnt=3
I0503 09:29:01.487524 41053 impala-hs2-server.cc:492] ExecuteStatement(): return_val=TExecuteStatementResp {
01: status (struct) = TStatus {
01: statusCode (i32) = 0,
},
02: operationHandle (struct) = TOperationHandle {
01: operationId (struct) = THandleIdentifier {
01: guid (string) = "<some value>",
02: secret (string) = "<some value>",
},
02: operationType (i32) = 0,
03: hasResultSet (bool) = false,
},
}
I0503 09:29:01.487515 42406 coordinator.cc:789] Coordinator waiting for backends to finish, 1 remaining
I0503 09:29:01.491019 42404 client-internal.cc:281] Determining the new leader Master and retrying...
I0503 09:29:01.507668 42404 client-internal.cc:283] Unable to determine the new leader Master: Not authorized: Client connection negotiation failed: client connection to <Kudu_leader_IP>:7051: FATAL_INVALID_AUTHENTICATION_TOKEN: Not authorized: authentication token expired
I0503 09:29:01.528237 42404 client-internal.cc:283] Unable to determine the new leader Master: Not authorized: Client connection negotiation failed: client connection to <Kudu_leader_IP>:7051: FATAL_INVALID_AUTHENTICATION_TOKEN: Not authorized: authentication token expired
...

 

 

 

Cloudera Employee
Posts: 12
Registered: ‎02-22-2017

Re: Authentication token expired at impala coordinators after Kudu restart

Hi Andreyeff,

 

'FATAL_INVALID_AUTHENTICATION_TOKEN: Not authorized: authentication token expired' error indicates this is authn token expiration issue. However, in CDH5.14.2, Kudu client should be able to automatically re-acquires authn token when needed.  Do you know which Kudu client version you are using (CDH5.14.2)?  When you launch the impala query, does the coordinator have primary credentials (i.e. Kerberos)?

 

Best,

Hao

Explorer
Posts: 13
Registered: ‎05-03-2018

Re: Authentication token expired at impala coordinators after Kudu restart

[ Edited ]

Answering your questions:

1) kudu -version
kudu 1.6.0-cdh5.14.2

However, not sure if Kudu client is used in communication between Impala and Kudu.

2) Impala coordinators use LDAP authentication, impala executors - don't.

Kerberos is not used.

Moreover, the new session is initiated and the same issue appears (this happens till Impala is restarted).

 

The sequence was:

Normally the cluster works, Impala-users connect and execute queries through coordinator.

There was Kudu leader change from one master to another (possibly also masterA->masterC->masterA).

The new Impala-user runs a query and receives this issue.

Impala restart has removed the effect and cluster returned to usual operating mode.

Cloudera Employee
Posts: 52
Registered: ‎04-08-2014

Re: Authentication token expired at impala coordinators after Kudu restart

Is it possible that Impala was not restarted after a previous upgrade?

 

Please report the issue if this happens again on CDH 5.14+

 

Thanks,

Mike

Explorer
Posts: 13
Registered: ‎05-03-2018

Re: Authentication token expired at impala coordinators after Kudu restart

The cluster was stopped during upgrade (from 5.13.0), so impala (as well as other services) is expected to be started with new configs since then.

 

From cluster inspect the only thing remained running from 5.13 is supervisord process.

 

I'll share more details if it appears again.

Cloudera Employee
Posts: 52
Registered: ‎04-08-2014

Re: Authentication token expired at impala coordinators after Kudu restart

Please do. Thanks in advance for that.

Explorer
Posts: 13
Registered: ‎05-03-2018

Re: Authentication token expired at impala coordinators after Kudu restart

[ Edited ]

The issue appeared again:

this time it was at nodes, which are both "coordinator only" and "coordinator+executor".

Coordinator+executor didn't have LDAP authentication.

 

Recent changes before the issue was noticed: there was added one new impala-coordinator-only instance.

 

One more observation:

  • impala external tables (on top of Kudu) continue working fine
  • impala tables stored as Kudu were failing with the warning (in impala CLI):
    WARNINGS: Unable to open Kudu table: Timed out: GetTableSchema timed out after deadline expired

 

 

Kudu-master log file:

 

negotiation.cc:320] Unauthorized connection attempt: Server connection negotiation failed: server connection from <impala-coord-executor-IP-address>:48648: authentication token expired
...

 

Impala-coord+executor log file:

I0801 13:46:10.203364 39650 coordinator.cc:370] started execution on 2 backends for query 21448fac89ef4281:3bffcd4800000000
I0801 13:46:10.238082 39926 query-exec-mgr.cc:149] ReleaseQueryState(): query_id=21448fac89ef4281:3bffcd4800000000 refcnt=4
I0801 13:46:10.310528 39928 client-internal.cc:281] Determining the new leader Master and retrying...
I0801 13:46:10.347209 39928 client-internal.cc:283] Unable to determine the new leader Master: Not authorized: Client connection negotiation failed: client connection to <kudu-leader-IP>:7051: FATAL_INVALID_AUTHENTICATION_TOKEN: Not authorized: authentication token expired
I0801 13:46:10.374269 39928 client-internal.cc:283] Unable to determine the new leader Master: Not authorized: Client connection negotiation failed: client connection to <kudu-leader-IP>:7051: FATAL_INVALID_AUTHENTICATION_TOKEN: Not authorized: authentication token expired

 

Restart of a single impala coordinator+executor instance (where the issue happened) didn't help.

Restart of all impala daemon instances resolved the issue.

Cloudera Employee
Posts: 52
Registered: ‎04-08-2014

Re: Authentication token expired at impala coordinators after Kudu restart

@Andreyeff please remind us what version you are running at this time? Do you see anything related in the Catalog Daemon log?

Cloudera Employee
Posts: 12
Registered: ‎03-16-2017

Re: Authentication token expired at impala coordinators after Kudu restart

Hi,

 

Thank you for reporting an issue.

 

A couple of questions to clarify before starting troubleshooting the issue:

  1. How many Kudu masters do you have in your cluster?
  2. Do Kudu tables in Impala have all Kudu masters specified in their 'kudu.master_addresses' property?
  3. In the Impala's logs, do you see anything like 'Reconnecting to the cluster for a new authn token'?  If yes, what are the log lines right after that?
  4. In the Impala's logs, do you see anything like 'Unable to reconnect to the cluster for a new authn token'?  If yes, what are the error details that come along with that message?

 

 

Explorer
Posts: 13
Registered: ‎05-03-2018

Re: Authentication token expired at impala coordinators after Kudu restart

[ Edited ]

Thanks for fast replies and noting about version - forgot to mention it:

kudu 1.6.0-cdh5.14.4

Impala Shell v2.11.0-cdh5.14.4

 

mpercy,

1. Here is impala-shell (I have saved a piece of failed queries with a timestamp, so I'll try to provide details basing on it):

[<coord+executor_dns>>:21000] > select * from enabled_vars where pid="3020";
Query: select * from enabled_vars where pid="3020"
Query submitted at: 2018-08-01 13:47:35 (Coordinator: http://<coord+executor_dns>>:25000)
Query progress can be monitored at: http://<coord+executor_dns>>:25000/query_plan?query_id=2a43f7388ed44c8d:92ff202f00000000
WARNINGS: Unable to open Kudu table: Timed out: GetTableSchema timed out after deadline expired


[<coord+executor_dns>>:21000] > invalidate metadata;
Query: invalidate metadata
Query submitted at: 2018-08-01 13:48:45 (Coordinator: http://<coord+executor_dns>>:25000)
Query progress can be monitored at: http://<coord+executor_dns>>:25000/query_plan?query_id=e34f104534f46bd9:a22d22e200000000
Fetched 0 row(s) in 42.06s

2. Nothing much in catalog daemon log at that time:

I0801 13:46:19.166558 58677 catalog-server.cc:241] Catalog Version: 82226 Last Catalog Version: 82226
I0801 13:48:42.336992 47009 HdfsTable.java:1197] Incrementally loading table metadata for: <some_table>

 

 

Alexey1c,

1. 3 kudu masters

2. yes, all 3

show create table enabled_vars;
Query: show create table enabled_vars
+------------------------------------------------------------------------------------------------------------------------------------------------+
| result                                                                                                                                         |
+------------------------------------------------------------------------------------------------------------------------------------------------+
| CREATE TABLE <desired_database>.enabled_vars (                                                                                                        |
|   pid STRING NOT NULL ENCODING AUTO_ENCODING COMPRESSION DEFAULT_COMPRESSION,                                                                  |
|   var_code STRING NOT NULL ENCODING AUTO_ENCODING COMPRESSION DEFAULT_COMPRESSION,                                                             |
|   PRIMARY KEY (pid, var_code)                                                                                                                  |
| )                                                                                                                                              |
| PARTITION BY HASH (pid, var_code) PARTITIONS 2                                                                                                 |
| STORED AS KUDU                                                                                                                                 |
| TBLPROPERTIES ('STATS_GENERATED_VIA_STATS_TASK'='true', 'kudu.master_addresses'='<master1>,<master2>,<master3>') |
+------------------------------------------------------------------------------------------------------------------------------------------------+
Fetched 1 row(s) in 0.07s

3. Yes, it is met at the failing impala-daemon, but only more than a day before issue (see timestamp):

I0731 09:13:59.293942  4640 client-internal.cc:233] Reconnecting to the cluster for a new authn token
I0731 09:13:59.293975  4637 client-internal.cc:233] Reconnecting to the cluster for a new authn token
I0731 09:13:59.294001  4625 client-internal.cc:233] Reconnecting to the cluster for a new authn token
I0731 09:13:59.294025  4632 client-internal.cc:233] Reconnecting to the cluster for a new authn token
I0731 09:13:59.294055  4622 client-internal.cc:233] Reconnecting to the cluster for a new authn token
I0731 09:13:59.294075  4643 client-internal.cc:233] Reconnecting to the cluster for a new authn token
I0731 09:13:59.294252  4628 client-internal.cc:233] Reconnecting to the cluster for a new authn token
I0731 09:13:59.294342  4648 client-internal.cc:233] Reconnecting to the cluster for a new authn token
I0731 09:13:59.294668  4633 client-internal.cc:233] Reconnecting to the cluster for a new authn token
I0731 09:13:59.296530  4652 client-internal.cc:233] Reconnecting to the cluster for a new authn token
I0731 09:13:59.304276  4661 client-internal.cc:233] Reconnecting to the cluster for a new authn token
I0731 09:13:59.394750 29050 impala-internal-service.cc:44] ExecQueryFInstances(): query_id=464265b6715c2634:cdc2ab0500000000
I0731 09:13:59.395293 29050 query-exec-mgr.cc:46] StartQueryFInstances() query_id=464265b6715c2634:cdc2ab0500000000 coord=<another_coordinator_node>:22000
I0731 09:13:59.395375 29050 query-state.cc:173] Buffer pool limit for 464265b6715c2634:cdc2ab0500000000: 68719476736
I0731 09:13:59.396725 29050 initial-reservations.cc:60] Successfully claimed initial reservations (0) for query 464265b6715c2634:cdc2ab050
0000000

Do not see any errors closely after it.

4. No, can't find at the failing impala-daemon

Announcements