Kudu TLS Error

Explorer

Hello,

We are running into an error when an external Java client connects to Kudu through the Java API. On the server side we see a TLS connection being attempted:

W0616 13:01:28.007700 28840 negotiation.cc:290] Failed RPC negotiation. Trace:
0616 13:01:27.987227 (+     0us) reactor.cc:373] Submitting negotiation task for server connection from 10.x.x.xxx:64686
0616 13:01:27.987396 (+   169us) server_negotiation.cc:118] Beginning negotiation
0616 13:01:27.987397 (+     1us) server_negotiation.cc:282] Waiting for connection header
0616 13:01:27.987764 (+   367us) server_negotiation.cc:290] Connection header received
0616 13:01:27.988083 (+   319us) server_negotiation.cc:246] Received NEGOTIATE NegotiatePB request
0616 13:01:27.988084 (+     1us) server_negotiation.cc:331] Received NEGOTIATE request from client
0616 13:01:27.988098 (+    14us) server_negotiation.cc:258] Sending NEGOTIATE NegotiatePB response
0616 13:01:27.988120 (+    22us) server_negotiation.cc:139] Negotiated authn=SASL
0616 13:01:27.998217 (+ 10097us) server_negotiation.cc:246] Received TLS_HANDSHAKE NegotiatePB request
0616 13:01:27.998402 (+   185us) server_negotiation.cc:258] Sending TLS_HANDSHAKE NegotiatePB response
0616 13:01:28.001852 (+  3450us) server_negotiation.cc:246] Received TLS_HANDSHAKE NegotiatePB request
0616 13:01:28.005023 (+  3171us) server_negotiation.cc:258] Sending TLS_HANDSHAKE NegotiatePB response
0616 13:01:28.005048 (+    25us) server_negotiation.cc:496] Negotiated TLSv1.2 with cipher suite AES128-SHA256
0616 13:01:28.007617 (+  2569us) negotiation.cc:281] Negotiation complete: Network error: Server connection negotiation failed: server connection from 10.x.x.xxx:64686: BlockingRecv error: failed to read from TLS socket: Cannot send after transport endpoint shutdown (error 108)
Metrics: {"negotiator.queue_time_us":139,"thread_start_us":119,"threads_started":1}

The code we are using to connect is pretty straightforward:

KuduClient client = new KuduClient.KuduClientBuilder(KUDU_MASTER).build();

if (!client.getTablesList(tableName).getTablesList().isEmpty()) {
    System.out.println("deleting table if exists...");
    client.deleteTable(tableName);
}

It fails on the "if (!client.get....)" line.

 

This is the stack trace from the Java app:

 

Exception in thread "main" org.apache.kudu.client.NoLeaderFoundException: Master config (kudumaster.xxxx.xxx:7051) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: [Peer master-kudumaster.xxxx.xxx:7051] Connection disconnected
at org.apache.kudu.client.ConnectToCluster.incrementCountAndCheckExhausted(ConnectToCluster.java:241)
at org.apache.kudu.client.ConnectToCluster.access$000(ConnectToCluster.java:47)
at org.apache.kudu.client.ConnectToCluster$ConnectToMasterErrCB.call(ConnectToCluster.java:309)
at org.apache.kudu.client.ConnectToCluster$ConnectToMasterErrCB.call(ConnectToCluster.java:298)
at com.stumbleupon.async.Deferred.doCall(Deferred.java:1280)
at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1259)
at com.stumbleupon.async.Deferred.handleContinuation(Deferred.java:1315)
at com.stumbleupon.async.Deferred.doCall(Deferred.java:1286)
at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1259)
at com.stumbleupon.async.Deferred.callback(Deferred.java:1002)
at org.apache.kudu.client.KuduRpc.handleCallback(KuduRpc.java:238)
at org.apache.kudu.client.KuduRpc.errback(KuduRpc.java:292)
at org.apache.kudu.client.TabletClient.failOrRetryRpc(TabletClient.java:691)
at org.apache.kudu.client.TabletClient.failOrRetryRpcs(TabletClient.java:668)
at org.apache.kudu.client.TabletClient.cleanup(TabletClient.java:657)
at org.apache.kudu.client.TabletClient.channelDisconnected(TabletClient.java:608)
at org.apache.kudu.client.shaded.org.jboss.netty.channel.SimpleChannelUpstreamHandler.handleUpstream(SimpleChannelUpstreamHandler.java:102)
at org.apache.kudu.client.TabletClient.handleUpstream(TabletClient.java:601)
at org.apache.kudu.client.shaded.org.jboss.netty.channel.DefaultChannelPipeline.sendUpstream(DefaultChannelPipeline.java:564)
at org.apache.kudu.client.shaded.org.jboss.netty.channel.DefaultChannelPipeline$DefaultChannelHandlerContext.sendUpstream(DefaultChannelPipeline.java:791)
at

The cluster is kerberized with TLS/SSL enabled and has three Kudu master servers with a handful of tablet servers. The same code works fine on a non-kerberized cluster.

 

 

Any help would be greatly appreciated!

1 ACCEPTED SOLUTION

Expert Contributor

This question has been posted in many different places, and the problem seems to be related to the fact that the client is run from a Windows host, which has never been tested. It's not clear that it's really a TLS issue.

 


@mbigelow wrote:
I would try loading the CA cert that signed the certs for the Kudu master and worker nodes into the JSSE truststore. Do this on the host you are running the Kudu app on. Then any Java applications will use it when negotiating TLS handshakes. You could also create a truststore and use that when launching the Kudu app.

https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cm_sg_create_key_trust.html

That's not how it works in Kudu, though. The CA cert is stored in the Kudu master, so there is no need to touch the JSSE truststore. See http://kudu.apache.org/docs/security.html
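
Since the cluster is kerberized, one thing that may be worth ruling out on the client host is whether the client JVM can see a Kerberos ticket at all. This is a guess, not a confirmed cause of the error above; the Kudu security docs linked above describe clients obtaining a ticket with kinit. Below is a minimal sketch that uses only the JDK's JAAS classes to attempt a login from the existing ticket cache without prompting. The class name, the method names, and the context name "kudu-client-check" are made up for illustration and are not part of any Kudu API.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;
import javax.security.auth.login.LoginContext;
import javax.security.auth.login.LoginException;

// Hypothetical helper: checks whether this JVM can log in from the Kerberos ticket cache.
final class TicketCacheCheck {

    // Builds an in-memory JAAS configuration equivalent to a jaas.conf entry
    // that uses Krb5LoginModule with the ticket cache and no interactive prompt.
    static Configuration ticketCacheConfig() {
        Map<String, String> options = new HashMap<>();
        options.put("useTicketCache", "true");
        options.put("doNotPrompt", "true");
        final AppConfigurationEntry entry = new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED,
                options);
        return new Configuration() {
            @Override
            public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
                // The same entry is returned for any context name.
                return new AppConfigurationEntry[] { entry };
            }
        };
    }

    // Returns true if a login from the ticket cache succeeds, false otherwise.
    static boolean hasUsableTicket() {
        try {
            LoginContext ctx = new LoginContext(
                    "kudu-client-check", null, null, ticketCacheConfig());
            ctx.login();
            ctx.logout();
            return true;
        } catch (LoginException e) {
            System.err.println("Kerberos login from ticket cache failed: " + e.getMessage());
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("usable Kerberos ticket: " + hasUsableTicket());
    }
}
```

If this reports no usable ticket on the Windows host, that would be worth fixing before looking further at the TLS negotiation; if it reports a ticket, the check has at least narrowed things down.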


5 REPLIES 5

Champion
I would try loading the CA cert that signed the certs for the Kudu master and worker nodes into the JSSE truststore. Do this on the host you are running the Kudu app on. Then any Java applications will use it when negotiating TLS handshakes. You could also create a truststore and use that when launching the Kudu app.

https://www.cloudera.com/documentation/enterprise/5-8-x/topics/cm_sg_create_key_trust.html

Explorer

Thanks for the responses! J-D, do you have an example post I can take a look at, given that the cause could be that the Java client is on a Windows machine?

Explorer

I have an NTP issue on my Cloudera Manager 5.11.1 Express edition cluster. Because of the NTP issue, I am having trouble with my Kudu master and tablet server.

Below is the error I am getting:

 

I0910 17:03:52.896631 76243 heartbeater.cc:291] Connected to a master server at hadoopmaster1.local.net:7051
W0910 17:03:52.901327 76243 heartbeater.cc:498] Failed to heartbeat to hadoopmaster1.local.net:7051: Runtime error: failed to adopt master-signed X509 cert: could not verify certificate chain (error with cert: subject=UID = kudu, issuer=CN = kudu-ipki-ca): certificate is not yet valid

 

In the meantime, I am restarting Kudu every hour because of this.

Rising Star

The snippet posted shows that the tablet server is unable to verify the TLS certificate generated for it because the certificate's 'valid from' field is in the future. That's most likely because the master host's clock is at least 1 second ahead of the tablet server host's clock.

 

Tablet server TLS certificates are generated by the master when the tablet server first connects to the master after starting up. The tablet server will retry the connection with its next heartbeat to the master, sending a new certificate signing request, and the master will again generate a new certificate whose validity start date is in the future.

 

I suspect that the error will continue to appear even if you restart Kudu, so restarting will not help. You need to synchronize the clocks across the machines in the cluster to within 1 second of each other. If NTP does not work for you, I would recommend at least running 'ntpdate' on every machine of your cluster before starting the Kudu servers.
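
To make the reasoning above concrete, here is a small illustrative sketch of how a clock that runs ahead on the master turns into "certificate is not yet valid" on the tablet server. It is not Kudu's code (the Kudu servers do this check in C++ through their TLS library); the class name, the timestamps, and the 2-second skew are hypothetical, and it assumes the certificate's 'valid from' time is simply the master's clock reading at signing time.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical illustration of the 'valid from' check under clock skew.
final class CertValidityCheck {

    // True when a verifier whose clock reads verifierNow would consider
    // a certificate starting at notBefore to be "not yet valid".
    static boolean notYetValid(Instant notBefore, Instant verifierNow) {
        return verifierNow.isBefore(notBefore);
    }

    public static void main(String[] args) {
        // Hypothetical clock readings taken at the same real-world moment.
        Instant masterNow = Instant.parse("2017-09-10T17:03:54Z");
        Duration skew = Duration.ofSeconds(2);       // master runs 2 s ahead (assumed)
        Instant tabletServerNow = masterNow.minus(skew);

        // Assumption: the master stamps the cert with its own current time.
        Instant notBefore = masterNow;

        // The tablet server sees a start time in its own future and rejects the cert.
        System.out.println(notYetValid(notBefore, tabletServerNow)); // true
        // With synchronized clocks the same cert would be accepted.
        System.out.println(notYetValid(notBefore, masterNow));       // false
    }
}
```

Every retry hits the same comparison with a freshly stamped certificate, which is why restarting does not help until the clocks agree. In Java code the equivalent real check is X509Certificate.checkValidity(Date), which throws CertificateNotYetValidException in this situation.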