Created on 06-16-2017 01:44 PM - edited 09-16-2022 04:46 AM
Hello,
We are running into an error when an external Java client connects to Kudu through the Java API. On the server side we see a TLS connection being attempted:
W0616 13:01:28.007700 28840 negotiation.cc:290] Failed RPC negotiation. Trace:
0616 13:01:27.987227 (+ 0us) reactor.cc:373] Submitting negotiation task for server connection from 10.x.x.xxx:64686
0616 13:01:27.987396 (+ 169us) server_negotiation.cc:118] Beginning negotiation
0616 13:01:27.987397 (+ 1us) server_negotiation.cc:282] Waiting for connection header
0616 13:01:27.987764 (+ 367us) server_negotiation.cc:290] Connection header received
0616 13:01:27.988083 (+ 319us) server_negotiation.cc:246] Received NEGOTIATE NegotiatePB request
0616 13:01:27.988084 (+ 1us) server_negotiation.cc:331] Received NEGOTIATE request from client
0616 13:01:27.988098 (+ 14us) server_negotiation.cc:258] Sending NEGOTIATE NegotiatePB response
0616 13:01:27.988120 (+ 22us) server_negotiation.cc:139] Negotiated authn=SASL
0616 13:01:27.998217 (+ 10097us) server_negotiation.cc:246] Received TLS_HANDSHAKE NegotiatePB request
0616 13:01:27.998402 (+ 185us) server_negotiation.cc:258] Sending TLS_HANDSHAKE NegotiatePB response
0616 13:01:28.001852 (+ 3450us) server_negotiation.cc:246] Received TLS_HANDSHAKE NegotiatePB request
0616 13:01:28.005023 (+ 3171us) server_negotiation.cc:258] Sending TLS_HANDSHAKE NegotiatePB response
0616 13:01:28.005048 (+ 25us) server_negotiation.cc:496] Negotiated TLSv1.2 with cipher suite AES128-SHA256
0616 13:01:28.007617 (+ 2569us) negotiation.cc:281] Negotiation complete: Network error: Server connection negotiation failed: server connection from 10.x.x.xxx:64686: BlockingRecv error: failed to read from TLS socket: Cannot send after transport endpoint shutdown (error 108)
Metrics: {"negotiator.queue_time_us":139,"thread_start_us":119,"threads_started":1}
The code we are using to connect is pretty straightforward:
KuduClient client = new KuduClient.KuduClientBuilder(KUDU_MASTER).build();
if (!client.getTablesList(tableName).getTablesList().isEmpty()) {
    System.out.println("deleting table if exists...");
    client.deleteTable(tableName);
}
It fails on the "if (!client.get....)" line.
This is the stack trace from the Java app:
Exception in thread "main" org.apache.kudu.client.NoLeaderFoundException: Master config (kudumaster.xxxx.xxx:7051) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: [Peer master-kudumaster.xxxx.xxx:7051] Connection disconnected
    at org.apache.kudu.client.ConnectToCluster.incrementCountAndCheckExhausted(ConnectToCluster.java:241)
    at org.apache.kudu.client.ConnectToCluster.access$000(ConnectToCluster.java:47)
    at org.apache.kudu.client.ConnectToCluster$ConnectToMasterErrCB.call(ConnectToCluster.java:309)
    at org.apache.kudu.client.ConnectToCluster$ConnectToMasterErrCB.call(ConnectToCluster.java:298)
    at com.stumbleupon.async.Deferred.doCall(Deferred.java:1280)
    at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1259)
    at com.stumbleupon.async.Deferred.handleContinuation(Deferred.java:1315)
    at com.stumbleupon.async.Deferred.doCall(Deferred.java:1286)
    at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1259)
    at com.stumbleupon.async.Deferred.callback(Deferred.java:1002)
    at org.apache.kudu.client.KuduRpc.handleCallback(KuduRpc.java:238)
    at org.apache.kudu.client.KuduRpc.errback(KuduRpc.java:292)
    at org.apache.kudu.client.TabletClient.failOrRetryRpc(TabletClient.java:691)
    at org.apache.kudu.client.TabletClient.failOrRetryRpcs(TabletClient.java:668)
    at org.apache.kudu.client.TabletClient.cleanup(TabletClient.java:657)
    at org.apache.kudu.client.TabletClient.channelDisconnected(TabletClient.java:608)
    at org.apache.kudu.client.shaded.org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
    at org.apache.kudu.client.TabletClient.handleUpstream(TabletClient.java:601)
    at org.apache.kudu.client.shaded.org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
    at org.apache.kudu.client.shaded.org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
    at
The cluster is Kerberized with TLS/SSL enabled and has 3 Kudu master servers and a handful of tablet servers. The same code works fine on a non-Kerberized cluster.
Any help would be greatly appreciated!
Created 06-17-2017 10:26 AM
This question has been posted in several places. The problem seems to be related to the client running on a Windows host, a setup that has never been tested. It's not clear that it's really a TLS issue.
@mbigelow wrote:
I would try loading the CA cert that signed the certs for the Kudu master and worker nodes into the JSSE truststore. Do this on the host you are running the Kudu app on. Then any Java applications will use it when negotiating TLS handshakes. You could also create a truststore and use that when launching the Kudu app.
https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cm_sg_create_key_trust.html
That's not how it works in Kudu, though. The CA cert is stored in the Kudu master, so there is no need to touch JSSE. See http://kudu.apache.org/docs/security.html
Created 06-16-2017 07:17 PM
Created 06-19-2017 08:18 AM
Thanks for the responses! J-D, do you have an example post I can take a look at, given that the problem may be caused by the Java client running on a Windows machine?
Created 09-11-2017 09:00 AM
I have an NTP issue on my Cloudera Manager 5.11.1 Express edition cluster. Because of it, I am having trouble with my Kudu master and tablet server.
Below is the error I am getting:
I0910 17:03:52.896631 76243 heartbeater.cc:291] Connected to a master server at hadoopmaster1.local.net:7051
W0910 17:03:52.901327 76243 heartbeater.cc:498] Failed to heartbeat to hadoopmaster1.local.net:7051: Runtime error: failed to adopt master-signed X509 cert: could not verify certificate chain (error with cert: subject=UID = kudu, issuer=CN = kudu-ipki-ca): certificate is not yet valid
In the meantime, I am restarting Kudu every hour because of this.
Created 09-20-2017 03:21 PM
The snippet shows that the tablet server cannot verify the TLS certificate generated for it because the certificate's 'valid from' field is in the future. Most likely, the master host's clock is at least 1 second ahead of the tablet server host's clock.
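To make the 'not yet valid' condition concrete, here is a minimal Java sketch of the comparison involved. It is not Kudu's actual verification code; the class and method names are hypothetical, and it assumes the certificate's 'valid from' time equals the master's clock at the moment of issue.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration only, not Kudu code.
public class CertValidityCheck {

    // True when the verifier's clock is still before the certificate's 'valid from' time.
    public static boolean isNotYetValid(Instant notBefore, Instant verifierNow) {
        return verifierNow.isBefore(notBefore);
    }

    public static void main(String[] args) {
        // Assumption: the master stamps 'valid from' with its own current time.
        Instant masterNow = Instant.parse("2017-09-10T17:03:52Z");
        Instant notBefore = masterNow;

        // Tablet server clock is 2 seconds behind the master.
        Instant tserverNow = masterNow.minus(Duration.ofSeconds(2));

        System.out.println(isNotYetValid(notBefore, tserverNow));              // true
        System.out.println(isNotYetValid(notBefore, masterNow.plusSeconds(1))); // false
    }
}
```

With the tablet server clock behind the master, the check reports the certificate as not yet valid, which matches the error in the log above.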
Tablet server TLS certificates are generated by the master when the tablet server first connects to it after starting up. When verification fails, the tablet server retries with its next heartbeat to the master, sending a new certificate signing request, and the master again generates a new certificate whose validity start is in the future.
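The retry behavior described here can be simulated with a small Java sketch. This is a toy model under stated assumptions, not Kudu's heartbeat logic: the names are hypothetical, the skew is held constant, and each new certificate is assumed to be valid from the master's clock at issue time.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical toy model of repeated certificate requests under constant clock skew.
public class HeartbeatRetrySketch {

    // Counts how many of the given heartbeat attempts end with a "not yet valid" certificate.
    public static int failedAttempts(Instant masterStart, Duration masterAhead,
                                     Duration heartbeatInterval, int attempts) {
        int failures = 0;
        for (int i = 0; i < attempts; i++) {
            Instant masterNow = masterStart.plus(heartbeatInterval.multipliedBy(i));
            Instant tserverNow = masterNow.minus(masterAhead); // tablet server lags the master
            Instant notBefore = masterNow;                     // assumption: valid from issue time
            if (tserverNow.isBefore(notBefore)) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Instant start = Instant.parse("2017-09-10T17:03:52Z");
        // Master 2 s ahead: every one of 5 attempts fails.
        System.out.println(failedAttempts(start, Duration.ofSeconds(2), Duration.ofSeconds(1), 5)); // 5
        // Clocks in sync: no attempt fails.
        System.out.println(failedAttempts(start, Duration.ZERO, Duration.ofSeconds(1), 5));         // 0
    }
}
```

Because each fresh certificate is stamped with the master's current time, retrying never helps in this model while the skew persists.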
I suspect the error will keep appearing even if you restart Kudu; restarting will not help. You need to synchronize the clocks across the machines in the cluster to within 1 second of each other. If NTP does not work for you, I would recommend at least running 'ntpdate' on every machine in the cluster before starting the Kudu servers.
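As a rough way to reason about the 1-second requirement, the following Java sketch checks whether a set of measured clock offsets stays within a tolerance. The host names and offset values are made up for illustration, and how the offsets are measured (for example, from NTP tooling output) is left as an assumption outside this sketch.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: compares per-host clock offsets against a tolerance.
public class ClockSkewCheck {

    // Largest difference between any two offsets, in milliseconds (0 for an empty input).
    public static long maxSkewMillis(Collection<Long> offsetsMillis) {
        if (offsetsMillis.isEmpty()) {
            return 0L;
        }
        return Collections.max(offsetsMillis) - Collections.min(offsetsMillis);
    }

    // True when every pair of hosts is closer together than the tolerance.
    public static boolean withinTolerance(Collection<Long> offsetsMillis, long toleranceMillis) {
        return maxSkewMillis(offsetsMillis) < toleranceMillis;
    }

    public static void main(String[] args) {
        // Offsets relative to some reference clock; names and values are illustrative.
        Map<String, Long> offsets = new LinkedHashMap<>();
        offsets.put("master-1", 0L);
        offsets.put("tserver-1", -1500L);
        offsets.put("tserver-2", 200L);

        System.out.println("max skew ms: " + maxSkewMillis(offsets.values()));          // 1700
        System.out.println("within 1 s: " + withinTolerance(offsets.values(), 1000L)); // false
    }
}
```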