New Contributor
Posts: 5
Registered: ‎09-04-2013
Accepted Solution

Kudu ntpd only - no chrony

Is anyone aware of a *technical* (not personal preference) reason why Kudu is tightly coupled to ntpd only? At the moment we are hand-modifying scripts for each Kudu release to support our chrony-based environment.

Cloudera Employee
Posts: 51
Registered: ‎09-28-2015

Re: Kudu ntpd only - no chrony

Hi Derek,

No technical reason as far as I'm aware, but I haven't looked into
chrony, and as far as I know nobody has tested it. So long as chrony uses the same
adjtimex() syscalls to keep the kernel apprised of clock synchronization
status, including maxerror estimates, it should work OK.

What scripts are you having to hand-modify?

-Todd
New Contributor
Posts: 5
Registered: ‎09-04-2013

Re: Kudu ntpd only - no chrony

I would have thought the same, but it does not turn out that way, at least for us.

We modify the kudu-tserver script to stop chrony and start ntpd (with an equivalent config) just before this line:

exec ${KUDU_HOME}/sbin/kudu-tserver "$@"

 

This works with ntpd running and fails with chrony.

Log message with chrony running:

Check failed: _s.ok() Bad status: Service unavailable: Cannot initialize clock: Error: Clock synchronized but error was too high (10057805 us).

The clocks on all hosts are synchronised fine.

With ntpd everything works fine.

Cloudera Employee
Posts: 51
Registered: ‎09-28-2015

Re: Kudu ntpd only - no chrony

Sounds like chrony isn't setting the maxerror estimate in the kernel. Kudu
depends on knowing a strict bound on the clock error between machines --
even if the clocks are well synchronized, it's important to know what the
absolute worst case error is to avoid giving incorrect results.

If you are willing to roll the dice and potentially receive inconsistent
results, you can change the --max_clock_sync_error_usec configuration flag
to a larger value.

-Todd
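
A minimal sketch of how one might check the kernel-reported bound against Kudu's limit. It assumes the `ntptime` utility (shipped with the ntp package) is installed and that its output contains a `maximum error N us` field, and it assumes the 10 s default for --max_clock_sync_error_usec. The function names are made up for illustration; this is not Kudu's own check.

```shell
# Assumed default of Kudu's --max_clock_sync_error_usec flag (10 s).
MAX_CLOCK_SYNC_ERROR_USEC=10000000

# Print the first "maximum error N us" value found on stdin (ntptime-style text).
extract_max_error_usec() {
    sed -n 's/.*maximum error \([0-9][0-9]*\) us.*/\1/p' | head -n 1
}

# Succeed if the error in $1 (us) is within the limit in $2 (defaults to the value above).
within_kudu_limit() {
    [ "$1" -le "${2:-$MAX_CLOCK_SYNC_ERROR_USEC}" ]
}

# Possible usage on a host where ntptime is available:
#   err=$(ntptime | extract_max_error_usec)
#   within_kudu_limit "$err" || echo "max error ${err} us exceeds the Kudu limit"
```

If the value printed under chrony is already above the limit while the same host passes under ntpd, that would point at how the maxerror estimate is being reported to the kernel rather than at the clocks themselves.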
New Contributor
Posts: 5
Registered: ‎09-04-2013

Re: Kudu ntpd only - no chrony

Raising --max_clock_sync_error_usec is probably not the path we'd like to pursue.

We'll investigate the maxerror estimate path with chrony.

New Contributor
Posts: 5
Registered: ‎07-01-2016

Re: Kudu ntpd only - no chrony

My 2 cents: before dumping Chrony, evaluate the actual quality of the time sync on your servers, and tweak some settings if necessary.

 

On our OpenStack VMs... 

 

$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
 kvm-clock

=> consistent with the Red Hat documentation, since this is a VM

 

$ ntpstat
 synchronised to NTP server (10.x.x.x) at stratum 5
 time correct to within 105 ms
 polling server every 1024 s

=> NTP polls only every 1024 s, about 17 minutes (and sometimes it misses the deadline, see below)

 

$ while true ; do sleep 10s ; ntptime | sed -n '3 s/,.*$//p' ; done
 maximum error 813965 us
 maximum error 818965 us
 maximum error 823965 us
 ...
 maximum error 1129465 us
 maximum error 83712 us
 maximum error 88712 us
 ...

 

=> so the "max error" grew by 5 ms for every 10 s elapsed, until it reached 1130 ms, at which point an NTP resync happened and it dropped to ~80 ms, and so on
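
As a quick check of that rate, here is a small sketch (the function name is made up for illustration) that divides the difference between two successive readings by the sampling interval:

```shell
# Growth rate of the max error, in us per second, between two readings
# taken interval_s seconds apart. The subshell body keeps the variables local.
max_error_growth_rate() (
    first_us=$1; second_us=$2; interval_s=$3
    echo $(( (second_us - first_us) / interval_s ))
)

max_error_growth_rate 813965 818965 10   # first two samples above; prints 500
```

500 us per second is the same thing as 5 ms per 10 s, i.e. 500 ppm.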

 

Simple math: if the resync had happened after 1024 s as expected, the "max error" would have been capped at about 80 + 102 * 5 = 590 ms, yet here it reached 1130 ms.

 

Well, repeating the observation showed that the sync usually happens every 1024 s, except when Chrony misses the deadline and syncs after 2x 1024 s -- or maybe 3x etc.
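
The same arithmetic as a small sketch, using the rounded figures from this post (80 ms after a sync, +5 ms per 10 s sample, 1024 s polling interval). The function names are hypothetical, and integer division is used on purpose to match the 102 whole samples in the estimate.

```shell
# Expected peak (ms): error right after a sync plus the growth accumulated
# over one polling interval, counted in whole samples.
expected_peak_ms() (
    base_ms=$1; poll_s=$2; step_ms=$3; sample_s=$4
    echo $(( base_ms + poll_s / sample_s * step_ms ))
)

# Seconds of uninterrupted growth needed to climb from base_ms to peak_ms.
seconds_since_sync() (
    base_ms=$1; peak_ms=$2; step_ms=$3; sample_s=$4
    echo $(( (peak_ms - base_ms) / step_ms * sample_s ))
)

expected_peak_ms 80 1024 5 10      # prints 590
seconds_since_sync 80 1130 5 10    # prints 2100
```

2100 s is roughly 2 x 1024 s, which fits the picture of a single missed poll before the observed peak.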

 

Anyway, in our case the "max error" stays comfortably below the Kudu limit, which is 10 s by default (see the source code at https://github.com/cloudera/kudu/blob/master/src/kudu/server/hybrid_clock.cc).

So Chrony is a good fit for us :-)

 
