Explorer
Posts: 8
Registered: ‎02-15-2018

Cloudera 5.12 cluster randomly reports "Clock Offset Bad" with working NTP Server


Hi everyone,

I have an issue with clock offset. I am using a Cloudera QuickStart VM with:

CDH 5.12

Red Hat Enterprise Linux 6 (64-bit)


My VM is hosted somewhere on OVH servers, so, to be safe, I simply used the following in my ntp.conf file:

Capture.PNG

 

I restarted my NTP daemon with "service ntp restart". For a while the sync is OK with one of the NTP servers, and then it is lost after about a minute. Here is the output of ntpdc -p:

Capture2.PNG

 

and the output of ntpdc -c sysinfo:

Capture22.PNG

 

In the /var/log/cloudera-scm-agent/cloudera-scm-agent.log file, there is a warning from the host manager saying that no NTP server is in sync:

Capture3.PNG

 

The drift value in /var/lib/ntp/drift is constant at 499.985.

Any ideas, please?

Best regards, Sélim

Explorer
Posts: 8
Registered: ‎02-15-2018

Re: Cloudera 5.12 cluster randomly reports "Clock Offset Bad" with working NTP Server

Hi everyone,

 

It has been a week since I posted about my clock offset issue. I checked the other posts related to this problem and tried several things, as described in my previous post. Can anyone help me, please?

 

Best regards, Sélim

Posts: 1,033
Topics: 1
Kudos: 257
Solutions: 127
Registered: ‎04-22-2014

Re: Cloudera 5.12 cluster randomly reports "Clock Offset Bad" with working NTP Server

@selim,

 

Indeed, something is wrong with your ntpd.

 

ntpd logs to syslog, so you can probably check for information in /var/log/messages.
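If /var/log/messages is large, a small filter can help isolate the ntpd entries. The following is only a rough sketch: it assumes the usual syslog layout in which the program tag appears as "ntpd[PID]:" or "ntpd:", and the function names are made up for illustration.

```python
import re

# Matches the syslog program tag for ntpd, e.g. "ntpd[1234]:" or "ntpd:".
# Assumption: the host uses the standard syslog line layout.
NTPD_TAG = re.compile(r"\bntpd(?:\[\d+\])?:")


def ntpd_lines(lines):
    """Return only the lines whose syslog tag is ntpd, without trailing newlines."""
    return [line.rstrip("\n") for line in lines if NTPD_TAG.search(line)]


def ntpd_lines_from_file(path="/var/log/messages"):
    """Read a syslog file and return its ntpd entries (usually requires root)."""
    with open(path, errors="replace") as fh:
        return ntpd_lines(fh)
```

Calling ntpd_lines_from_file() as root and printing the last few dozen entries should show what ntpd reported around the time the sync was lost.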

 

Also, while we would recommend resolving the issue, if you are confident that your clock will remain in sync with other hosts in the cluster, you can suppress or disable alerts for NTP issues.  The health checks are there since it is important that hosts remain in sync time-wise across your cluster.
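One more observation on the drift value from the original post: ntpd stores the frequency correction in the drift file in parts per million (PPM) and will not correct by more than 500 PPM, so a constant 499.985 suggests the correction is pinned at that limit, which is fairly common on virtual machines. The sketch below only does the arithmetic; the function names and the 1 PPM margin are illustrative assumptions, not part of ntpd.

```python
# Assumption: the drift file holds a single number, the frequency correction in PPM.
NTPD_MAX_FREQ_PPM = 500.0  # ntpd's frequency correction limit
SECONDS_PER_DAY = 86_400


def ppm_to_seconds_per_day(ppm):
    """Convert a frequency error in PPM to seconds of clock error per day."""
    return ppm * 1e-6 * SECONDS_PER_DAY


def is_pinned_at_limit(ppm, margin_ppm=1.0):
    """True if the drift is within margin_ppm of the 500 PPM limit.

    margin_ppm is an arbitrary threshold chosen for illustration.
    """
    return abs(ppm) >= NTPD_MAX_FREQ_PPM - margin_ppm


drift_ppm = 499.985  # value reported in the original post
print(f"{ppm_to_seconds_per_day(drift_ppm):.2f} s/day")  # 43.20 s/day
print(is_pinned_at_limit(drift_ppm))  # True
```

A clock that needs roughly 43 seconds of correction per day is more than ntpd is designed to discipline, which would be consistent with the sync being lost shortly after each restart.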

 

 

Explorer
Posts: 8
Registered: ‎02-15-2018

Re: Cloudera 5.12 cluster randomly reports "Clock Offset Bad" with working NTP Server

Hi Bgooley,

 

Thank you for your answer. As far as I understand, there is only a single host in the Cloudera QuickStart VM, so I simply deactivated the warning about clock offset. The cluster is now fairly stable, but yesterday I still had the following error:

There are 1 datanode(s) running and 1 node is excluded in this operation

 

Today it seems to work, for whatever reason, but do you have any idea what was happening? I would like to understand it. I emptied the /tmp directory and restarted HDFS in the meantime; maybe that is related?

 

Thank you,

Best regards, Sélim
