Expert Contributor
Posts: 113
Registered: ‎02-15-2016

kudu service are getting down frequently

Hi, Kudu is crashing frequently with the error "Couldn't get the current time: Clock unsynchronized. Status: Service unavailable: Error reading clock. Clock considered unsynchronized". However, I am not seeing any clock offset errors in the cluster. The Kudu documentation says this could be caused by network delay between the NTP server and the Kudu host, but Kudu shares its hosts with the HDFS DataNodes, and the DataNodes are not reporting any clock offset error. What could be the reason?

Cloudera Employee
Posts: 47
Registered: ‎02-05-2016

Re: kudu service are getting down frequently

HDFS datanodes don't require clock synchronization in the way that Kudu does.

Is NTP running on these nodes? What is the output of the 'ntptime' command? Are these nodes running on physical hardware, or something else?

Expert Contributor
Posts: 362
Registered: ‎01-25-2017

Re: kudu service are getting down frequently

I have the same issue and haven't found a solution for it.

For now I have added a cron job that restarts the ntpd service on all the servers every hour.

This issue is preventing me from going to production with Kudu, as it doesn't make sense to restart ntpd on 50 nodes each time.

Expert Contributor
Posts: 113
Registered: ‎02-15-2016

Re: kudu service are getting down frequently

@adar - yes, the DNs do not require NTP, but if NTP is out of sync on these DNs, CM will report a clock offset.

NTP is running on the DNs.

[root@wuwcw0hd3dn01 hadoop-hdfs]# ntptime
ntp_gettime() returns code 5 (ERROR)
time dce74398.988fc000 Sun, Jun 11 2017 0:20:40.595, (.595943),
maximum error 16000000 us, estimated error 16 us, TAI offset 0
ntp_adjtime() returns code 5 (ERROR)
modes 0x0 (),
offset 0.000 us, frequency 0.000 ppm, interval 1 s,
maximum error 16000000 us, estimated error 16 us,
status 0x4041 (PLL,UNSYNC,MODE),
time constant 7, precision 1.000 us, tolerance 500 ppm,
[root@wuwcw0hd3dn01 hadoop-hdfs]#

These are all physical servers.
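
A note on reading the ntptime output above: it reports a maximum error of 16000000 us (16 seconds) and an UNSYNC status, while the Kudu troubleshooting text quoted later in this thread gives a default limit of 10 seconds for the maximum clock error. The following is a minimal Python sketch, not part of Kudu or NTP, that pulls those fields out of ntptime output and flags them. The function name check_ntptime and the regular expressions are assumptions based only on the output format shown in this post; other ntptime versions may print differently.

```python
import re

# Default limit taken from the Kudu troubleshooting text quoted in this thread
# (10 seconds, expressed in microseconds).
DEFAULT_MAX_ERROR_USEC = 10_000_000


def check_ntptime(output, max_error_usec=DEFAULT_MAX_ERROR_USEC):
    """Return a list of problems found in `ntptime` output (empty if none).

    Hypothetical helper: the patterns assume the output format shown above.
    """
    problems = []

    # Lines such as "ntp_gettime() returns code 5 (ERROR)".
    pattern = r"(ntp_\w+)\(\) returns code \d+ \((\w+)\)"
    for call, name in re.findall(pattern, output):
        if name == "ERROR":
            problems.append(f"{call}() reports ERROR")

    # The status line lists kernel flags, e.g. "status 0x4041 (PLL,UNSYNC,MODE)".
    status = re.search(r"status 0x[0-9a-fA-F]+ \(([^)]*)\)", output)
    if status and "UNSYNC" in status.group(1).split(","):
        problems.append("kernel status includes UNSYNC")

    # "maximum error" is printed by both calls; compare the largest value.
    errors = [int(v) for v in re.findall(r"maximum error (\d+) us", output)]
    if errors and max(errors) > max_error_usec:
        problems.append(
            f"maximum error {max(errors)} us exceeds {max_error_usec} us"
        )

    return problems


# The ntptime output posted above, without the shell prompt lines.
SAMPLE = """\
ntp_gettime() returns code 5 (ERROR)
time dce74398.988fc000 Sun, Jun 11 2017 0:20:40.595, (.595943),
maximum error 16000000 us, estimated error 16 us, TAI offset 0
ntp_adjtime() returns code 5 (ERROR)
modes 0x0 (),
offset 0.000 us, frequency 0.000 ppm, interval 1 s,
maximum error 16000000 us, estimated error 16 us,
status 0x4041 (PLL,UNSYNC,MODE),
time constant 7, precision 1.000 us, tolerance 500 ppm,
"""

for problem in check_ntptime(SAMPLE):
    print(problem)
```

On a real host the same function could be fed the text captured from running ntptime. An empty list only means that none of these three checks fired; it is not a guarantee that Kudu will accept the clock.
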
Cloudera Employee
Posts: 47
Registered: ‎02-05-2016

Re: kudu service are getting down frequently

You can avoid the dependency on ntpd by running Kudu with --use-hybrid-clock=false, but that has a serious effect on transactional consistency so it's not something we recommend. Instead, I'd focus your efforts on figuring out why your servers' time isn't synchronized. It may have to do with your NTP configuration.

Unfortunately I don't know how NTP works; perhaps you can search across past forum posts? If you do manage to fix this, please post your findings here; if it's a general purpose fix (i.e. not particular to your site configuration), we'll include it in the Kudu documentation.

Expert Contributor
Posts: 362
Registered: ‎01-25-2017

Re: kudu service are getting down frequently

@MSharma Did you find a solution for this?

I'm still stuck with it.

Expert Contributor
Posts: 113
Registered: ‎02-15-2016

Re: kudu service are getting down frequently

Not yet, but restarting the NTP service caused more trouble, so I have set --use-hybrid-clock=false.
Most likely it is network delay between the NTP server and the Kudu server that is causing this.
I am still troubleshooting the problem and will post an update here if we find a way to resolve it.

New Contributor
Posts: 2
Registered: ‎10-25-2016

Re: kudu service are getting down frequently

I'm having the same trouble. When I restart the ntpd service, the Kudu service restarts successfully, but before long it returns to the failed state. The Kudu troubleshooting page has a tip for this problem. It says: "For the master and tablet server daemons, the server's clock must be synchronized using NTP. In addition, the maximum clock error (not to be mistaken with the estimated error) must be below a configurable threshold. The default value is 10 seconds, but it can be set with the flag --max_clock_sync_error_usec." The page (https://kudu.apache.org/docs/troubleshooting.html) provides the solution, but I don't know how or where to set the parameter --max_clock_sync_error_usec. Thanks.

Expert Contributor
Posts: 113
Registered: ‎02-15-2016

Re: kudu service are getting down frequently

In Cloudera Manager: Kudu --> Configuration --> "Kudu Service Advanced Configuration Snippet (Safety Valve) for gflagfile"
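
For illustration, the safety valve takes gflagfile lines of the form --flag=value, and this flag is expressed in microseconds. The small Python sketch below formats such a line from a threshold given in seconds. The helper name max_clock_error_flag and the 20 second value are assumptions chosen for the example, not a recommendation. Note also that the documentation quoted above lists the threshold as a requirement in addition to the clock being synchronized, so raising it may not help while ntptime still reports UNSYNC.

```python
def max_clock_error_flag(seconds):
    """Format the gflagfile line for --max_clock_sync_error_usec.

    Hypothetical helper: converts a threshold in seconds to microseconds.
    """
    if seconds <= 0:
        raise ValueError("threshold must be positive")
    return f"--max_clock_sync_error_usec={int(round(seconds * 1_000_000))}"


# Illustrative value only: 20 seconds instead of the documented 10 second default.
print(max_clock_error_flag(20))
```

The printed line is what would be pasted into the safety valve field, one flag per line.
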
Cloudera Employee
Posts: 70
Registered: ‎04-08-2014

Re: kudu service are getting down frequently

I would strongly recommend NOT running with hybrid time turned off for one
simple reason: tablet history GC will not work. Therefore when you delete
or update a row the history of that data will be kept forever. Eventually
you may run out of disk space. The one exception is if you drop a table,
then the data for that table will be permanently removed regardless of
hybrid time.