Hi, Kudu is crashing frequently with the error "Couldn't get the current time: Clock unsynchronized. Status: Service unavailable: Error reading clock. Clock considered unsynchronized". However, I am not seeing any clock offset error in the cluster. The Kudu documentation says this could be caused by network delay between the NTP server and the Kudu host, but Kudu is sharing the host with the database, and the DataNode is not reporting a clock offset error. What could be the reason?
HDFS datanodes don't require clock synchronization in the way that Kudu does.
Is NTP running on these nodes? What is the output of the 'ntptime' command? Are these nodes running on physical hardware, or something else?
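If it helps to collect that information from many nodes, here is a minimal Python sketch that reads `ntptime` output and flags a host whose kernel clock looks unsynchronized. Everything in it is an assumption for illustration: the function names are hypothetical, the sample text is hand-written to resemble typical `ntptime` output rather than copied from a real host, and the 10-second error threshold is a placeholder, not necessarily what your Kudu version enforces.

```python
import re
import subprocess

# Hand-written sample resembling ntptime output on an unsynchronized host
# (illustrative only, not captured from a real machine).
SAMPLE_UNSYNC = """\
ntp_gettime() returns code 5 (ERROR)
  maximum error 16000000 us, estimated error 16000000 us, TAI offset 0
ntp_adjtime() returns code 5 (ERROR)
  modes 0x0 (),
  offset 0.000 us, frequency 0.000 ppm, interval 1 s,
  maximum error 16000000 us, estimated error 16000000 us,
  status 0x40 (UNSYNC),
  time constant 7, precision 1.000 us, tolerance 500 ppm,
"""


def parse_ntptime(text):
    """Extract the ntp_adjtime return code, status flags, and maximum error."""
    code = re.search(r"ntp_adjtime\(\) returns code (\d+) \((\w+)\)", text)
    status = re.search(r"status 0x[0-9a-fA-F]+ \(([^)]*)\)", text)
    # "maximum error" is printed twice; use the last one (the ntp_adjtime section).
    errors = re.findall(r"maximum error (\d+) us", text)
    flags = status.group(1).split(",") if status and status.group(1) else []
    return {
        "code": int(code.group(1)) if code else None,
        "code_name": code.group(2) if code else None,
        "flags": flags,
        "max_error_us": int(errors[-1]) if errors else None,
    }


def looks_unsynchronized(info, max_error_us=10_000_000):
    """Return True if the parsed output suggests the clock is not synchronized.

    max_error_us is a placeholder threshold chosen for this sketch.
    """
    if info["code"] != 0:  # also covers output that could not be parsed (None)
        return True
    if "UNSYNC" in info["flags"]:
        return True
    return info["max_error_us"] is None or info["max_error_us"] > max_error_us


def check_host():
    """Run ntptime on the local host and report whether it looks unsynchronized."""
    # ntptime may exit non-zero when the clock is unsynchronized, so do not
    # treat a non-zero exit status as a failure to run the command.
    result = subprocess.run(["ntptime"], capture_output=True, text=True)
    return looks_unsynchronized(parse_ntptime(result.stdout))
```

Calling `check_host()` on each node (for example from the same cron job that currently restarts ntpd) would show which hosts lose synchronization and how large the maximum error gets, which is usually more useful than the restart alone.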
I have the same issue and haven't found a solution for it.
For now I have added a cron job that restarts the ntpd service on all the servers every hour.
This issue prevents me from going to production with Kudu, as it doesn't make sense to do the restart on 50 nodes each time.
You can avoid the dependency on ntpd by running Kudu with --use-hybrid-clock=false, but that has a serious effect on transactional consistency, so it's not something we recommend. Instead, I'd focus your efforts on figuring out why your servers' time isn't synchronized. It may have to do with your NTP configuration.
Unfortunately I don't know how NTP works; perhaps you can search across past forum posts? If you do manage to fix this, please post your findings here; if it's a general-purpose fix (i.e. not particular to your site configuration), we'll include it in the Kudu documentation.