Kudu T-Server Error

Explorer

Hi All,

 

I am getting the error below on one of my Kudu tablet servers. I have restarted the tablet server service on this host, but the error keeps appearing when I check:

W1118 19:43:31.815698 33067 consensus_peers.cc:435] T 3292e490cf4843d994a45f9a4c7782c0 P cc36320dd81646d081a24203751c2a6a -> Peer 164c8bcafccc4fd0adfb6dfe7a2ff60e (MYSERVER.com:7050): Couldn't send request to peer 164c8bcafccc4fd0adfb6dfe7a2ff60e for tablet 3292e490cf4843d994a45f9a4c7782c0. Error code: TABLET_NOT_RUNNING (12). Status: Illegal state: Tablet not RUNNING: INITIALIZED. Retrying in the next heartbeat period. Already tried 389 times.

Any help is much appreciated.

 

Regards

Amn 

6 REPLIES

Re: Kudu T-Server Error

Contributor

That message indicates that the Kudu tserver is in the process of bootstrapping all of its tablet replicas. It hasn't gotten to tablet 3292e490cf4843d994a45f9a4c7782c0 yet, but it should soon. If you look at MYSERVER.com:8050/tablets, you should be able to see the current state of each tablet replica on that tablet server (INITIALIZED, BOOTSTRAPPING, RUNNING, etc.).
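For anyone triaging many of these warnings, here is a minimal Python sketch that pulls the interesting fields (tablet ID, peer UUID and address, error code, replica state, retry count) out of a line like the one in the question. The regular expression is inferred from that single quoted log line, not from any Kudu specification, and `parse_peer_warning` is a hypothetical helper name.

```python
import re

# Pattern inferred from the one warning quoted in the question (an assumption,
# not an official Kudu log format definition).
PEER_WARNING = re.compile(
    r"T (?P<tablet>[0-9a-f]+) P (?P<local>[0-9a-f]+) -> "
    r"Peer (?P<peer>[0-9a-f]+) \((?P<addr>[^)]+)\): .*?"
    r"Error code: (?P<code>\w+) \(\d+\)\. .*?"
    r"Tablet not RUNNING: (?P<state>\w+)\. .*?"
    r"Already tried (?P<tries>\d+) times"
)


def parse_peer_warning(line):
    """Return the fields of a "Couldn't send request to peer" warning, or None."""
    match = PEER_WARNING.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["tries"] = int(fields["tries"])  # retry count as a number
    return fields


# The warning from the original question, verbatim.
sample = (
    "W1118 19:43:31.815698 33067 consensus_peers.cc:435] "
    "T 3292e490cf4843d994a45f9a4c7782c0 P cc36320dd81646d081a24203751c2a6a "
    "-> Peer 164c8bcafccc4fd0adfb6dfe7a2ff60e (MYSERVER.com:7050): "
    "Couldn't send request to peer 164c8bcafccc4fd0adfb6dfe7a2ff60e for "
    "tablet 3292e490cf4843d994a45f9a4c7782c0. Error code: TABLET_NOT_RUNNING (12). "
    "Status: Illegal state: Tablet not RUNNING: INITIALIZED. "
    "Retrying in the next heartbeat period. Already tried 389 times."
)

info = parse_peer_warning(sample)
print(info["addr"], info["tablet"], info["state"], info["tries"])
```

The `addr` field is the remote peer the request was sent to, so that is the server whose /tablets page is worth checking for the replica in the `state` shown.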

 

Slow bootstrapping can indicate a number of things: a large number of tablet replicas on that particular tablet server (in which case you might want to rebalance the cluster using the rebalancer tool), or a slow WAL disk (in which case you might want to use a faster disk for -fs_wal_dir, since that disk is shared among all tablet replicas).

 

Hope this helped!


Re: Kudu T-Server Error

Explorer

Hi Awong,

 

Thanks for the quick reply. The screenshot shows what I see. My understanding from it is that the bootstrap process has completed, since it says 100%. Also, when I click Details > toggle, the Last Status for the corresponding table name is either "Bootstrap complete." or "No bootstrap required, opened a new log".

When I check the logs I still see the same error as before. Is there anything else I can check? (Screenshot attached: Capture.JPG)

Regards

Amn


Re: Kudu T-Server Error

Contributor

You should check your other tablet servers. Those log messages may indicate that some of the tablet replicas are trying to communicate with replicas on other servers that are still bootstrapping.

 

Or are all of your tablet servers done bootstrapping?


Re: Kudu T-Server Error

Explorer

@awong 

 

I checked all 9 tablet servers and all of them are done bootstrapping. I see the same results as in the previous screenshot; the numbers are different, but it's all at 100%.

 

Regards

Amn


Re: Kudu T-Server Error

Contributor

Hm, that's pretty odd. And the messages are still coming in? These aren't old messages?
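One way to check whether the warnings are old is to read the timestamp at the start of each line. Kudu logs use the glog prefix (severity letter, then `mmdd hh:mm:ss.uuuuuu`), which carries no year, so this sketch assumes the year of the current clock. `glog_time` and `is_recent` are hypothetical helper names, and the 5-minute window and the year 2019 in the usage lines are arbitrary illustrative values.

```python
import re
from datetime import datetime, timedelta

# glog prefix: severity letter, month, day, then time with microseconds.
GLOG_PREFIX = re.compile(r"^[IWEF](\d{2})(\d{2}) (\d{2}):(\d{2}):(\d{2})\.(\d{6}) ")


def glog_time(line, year):
    """Parse the glog timestamp of a log line; the year must be supplied."""
    match = GLOG_PREFIX.match(line)
    if match is None:
        return None
    month, day, hour, minute, second, micro = (int(g) for g in match.groups())
    return datetime(year, month, day, hour, minute, second, micro)


def is_recent(line, now, window=timedelta(minutes=5)):
    """True if the line was logged within `window` before `now`."""
    logged_at = glog_time(line, now.year)  # assumption: same year as `now`
    if logged_at is None:
        return False
    return timedelta(0) <= now - logged_at <= window


line = "W1118 19:43:31.815698 33067 consensus_peers.cc:435] T 3292e490cf4843d994a45f9a4c7782c0 ..."
# 2019 is an illustrative year only; the log line itself does not contain one.
print(is_recent(line, now=datetime(2019, 11, 18, 19, 45, 0)))
print(is_recent(line, now=datetime(2019, 11, 18, 20, 30, 0)))
```

In practice `now` would be `datetime.now()` on the tablet server host, so clock and time zone match the log.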

 

If you run `kudu cluster ksck` on your cluster, what does it say about the health of that tablet?
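As a rough sketch of how that check could be scripted, the Python snippet below builds the `kudu cluster ksck` command line for one tablet. The master addresses are placeholders, `ksck_command` is a hypothetical helper, and the `-tablets` filter is written from memory of the ksck options, so please confirm it against `kudu cluster ksck --help` for your Kudu version.

```python
def ksck_command(master_addresses, tablet_ids=()):
    """Build the argument list for `kudu cluster ksck`.

    master_addresses: list of "host:port" strings for the Kudu masters.
    tablet_ids: optional tablet IDs to restrict the check to
                (assumes ksck accepts a -tablets filter).
    """
    cmd = ["kudu", "cluster", "ksck", ",".join(master_addresses)]
    if tablet_ids:
        cmd.append("-tablets=" + ",".join(tablet_ids))
    return cmd


# Placeholder master addresses; replace with the real masters of the cluster.
masters = ["master-1.example.com:7051", "master-2.example.com:7051"]
cmd = ksck_command(masters, ["3292e490cf4843d994a45f9a4c7782c0"])
print(" ".join(cmd))

# To actually run it on a host with the kudu CLI installed:
# import subprocess
# subprocess.run(cmd, check=False)
```

Restricting the check to the one tablet from the warning keeps the output short enough to paste into the thread.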


Re: Kudu T-Server Error

Explorer

All my tablet servers show as healthy, and the Summary by Table also shows the tables as healthy. There is nothing under 'Recovering / Under-Replicated / Unavailable'.
