KUDU NoLeaderFound Issue
- Labels: Apache Kudu, Apache YARN
Created ‎02-27-2022 11:12 PM
Hello,
I recently ran into a Kudu issue that occurs intermittently. The details are below.
CDH Version: 5.14.2
KUDU Version: 1.6.0-cdh5.14.2
ERROR Information:
21/11/28 06:56:03 ERROR yarn.ApplicationMaster: User class threw exception: org.apache.kudu.client.NoLeaderFoundException: Master config (10.186.93.6:7051,10.186.93.24:7051,10.186.93.8:7051) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: connection disconnected,org.apache.kudu.client.RecoverableException: connection disconnected
org.apache.kudu.client.NoLeaderFoundException: Master config (10.186.93.6:7051,10.186.93.24:7051,10.186.93.8:7051) has no leader. Exceptions received: org.apache.kudu.client.RecoverableException: connection disconnected,org.apache.kudu.client.RecoverableException: connection disconnected
at org.apache.kudu.client.ConnectToCluster.incrementCountAndCheckExhausted(ConnectToCluster.java:273)
at org.apache.kudu.client.ConnectToCluster.access$100(ConnectToCluster.java:49)
at org.apache.kudu.client.ConnectToCluster$ConnectToMasterErrCB.call(ConnectToCluster.java:349)
at org.apache.kudu.client.ConnectToCluster$ConnectToMasterErrCB.call(ConnectToCluster.java:338)
at com.stumbleupon.async.Deferred.doCall(Deferred.java:1280)
at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1259)
at com.stumbleupon.async.Deferred.handleContinuation(Deferred.java:1315)
at com.stumbleupon.async.Deferred.doCall(Deferred.java:1286)
at com.stumbleupon.async.Deferred.runCallbacks(Deferred.java:1259)
at com.stumbleupon.async.Deferred.callback(Deferred.java:1002)
...
Caused by: org.apache.kudu.client.RecoverableException: connection disconnected
... 55 more
This issue happens when I submit a Spark job to YARN. Most of the time the job runs fine, but the error shows up occasionally, perhaps several times per day.
Each time I have to resubmit the job to YARN; fortunately, the second attempt always succeeds.
Is this a bug or something else? Is there a fix or a workaround?
Thanks in advance.
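Since a resubmitted job reportedly always succeeds, the failure looks transient, and one possible workaround is to retry the connection step inside the job instead of resubmitting the whole job. The following is only a minimal sketch of that idea in Python, not part of the Kudu client: `retry_with_backoff`, `operation`, and `is_retryable` are hypothetical names, and `operation` stands in for whatever code in the job opens the Kudu connection.

```python
import time


def retry_with_backoff(operation, is_retryable, max_attempts=5,
                       base_delay=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is reached.

    operation    -- hypothetical zero-argument callable that connects to Kudu
    is_retryable -- hypothetical predicate deciding whether an exception
                    is worth retrying (for example a "no leader" error)
    sleep        -- injectable so the helper can be tested without waiting
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except Exception as exc:
            # Re-raise on the last attempt or when the error is not transient.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

A retry like this only hides the symptom. If the masters lose their leader several times per day, the master logs are still worth checking for the root cause.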
Created ‎02-28-2022 12:24 AM
Kudu has a hard requirement on an up-to-date NTP. Kudu masters and tablet servers will crash when their clocks are out of sync.
You can check whether your NTP service is up to date.
Created ‎02-28-2022 12:34 AM
Thanks for your reply, but NTP sync looks fine: the CDH cluster has its own check for time synchronization.
Are there any configuration suggestions for my Kudu cluster?
I have 3 masters and 76 Kudu tablet servers. Each node has about 48 cores, 384 GB of memory, and 5 TB of disk space.
We have a heavy read/write workload. Could that be causing this?
Created ‎02-28-2022 01:17 AM
Hi,
You're welcome!
Assuming the NTP server is fine, I have no idea what causes the issue.
I searched some blogs, and some people recommended using hostnames instead of IP addresses.
You can give it a try.
I am curious about this issue and will keep an eye on it.
Looking forward to seeing the solution.
Good luck!
Created ‎02-28-2022 01:31 AM
Switch the connection to use hostnames?
That sounds like taking the long way instead of the short one.
But maybe it works. OK, I'll give it a try.
I'll report back, hopefully with good news.
🙂
Created ‎02-28-2022 01:37 AM
The answer comes from https://www.cnblogs.com/chong-zuo3322/p/15934491.html
Created ‎03-15-2022 06:10 PM
Bad news: it did not work. The error message is now "org.apache.kudu.client.NoLeaderFoundException: Master config (hostnamex:7051,hostnamex:7051,hostnamex:7051)"
What could cause this issue? Could it be heavy load on the Kudu cluster?
Created ‎02-28-2022 01:51 AM
You should always use the fully qualified hostnames, instead of IP addresses.
Although it's longer, it can prevent problems. If your cluster is using TLS, for example, the full hostnames are required.
I imagine these logs that you shared are client logs, right?
Can you check the Tablet Server logs and see if there are errors in them? Those would help understand the issue.
André
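Before digging through server logs, it can help to confirm from the client host that every master address in the job's master list resolves and accepts TCP connections. The sketch below is a rough check in Python using only the standard library. The function names are made up for illustration, and the default port 7051 is taken from the error message in this thread. A successful TCP connection only shows that the port is open; it does not show which master is the leader or whether the masters are healthy.

```python
import socket


def parse_master_addresses(master_config, default_port=7051):
    """Split a 'host:port,host:port' string into (host, port) tuples."""
    addresses = []
    for entry in master_config.split(","):
        entry = entry.strip()
        if not entry:
            continue
        host, sep, port = entry.rpartition(":")
        if sep:
            addresses.append((host, int(port)))
        else:
            # No explicit port: fall back to the assumed default.
            addresses.append((entry, default_port))
    return addresses


def check_masters(master_config, timeout=3.0):
    """Try a TCP connection to each master and report the outcome."""
    results = {}
    for host, port in parse_master_addresses(master_config):
        try:
            # create_connection also resolves the hostname, so DNS
            # failures surface here as OSError subclasses.
            with socket.create_connection((host, port), timeout=timeout):
                results[(host, port)] = "reachable"
        except OSError as exc:
            results[(host, port)] = f"unreachable: {exc}"
    return results
```

Running `check_masters("10.186.93.6:7051,10.186.93.24:7051,10.186.93.8:7051")` from the node that hosts the YARN ApplicationMaster at the time of a failure would show whether plain network connectivity is part of the problem.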
Created ‎02-28-2022 01:59 AM
Thanks for the response.
Yes, the logs are from the client. I checked the Kudu logs for ERROR entries but found nothing.
I have now replaced the IP addresses with hostnames.
I hope it works.
