08-24-2015 06:32 PM
I have two different Cloudera Manager (CM) managed clusters that I upgraded from 5.3.3 to 5.4.1 and 5.4.3. On both clusters I get random "Clock Offset Bad" messages throughout the day. I checked the times on all nodes and they are all in sync. I also checked the NTP configuration and it appears to be correct. I did not have these issues when running 5.3.3. How does the CM agent check for a bad clock offset? Did something change between 5.3.3 and 5.4.x?
OS: Redhat Enterprise 6.6
08-28-2015 09:23 AM
Hello. The NTP clock health check is executed by the agent running on each node in your cluster. The command executed is:
08-28-2015 03:11 PM
10-21-2015 07:28 AM
Has anybody found a solution for this? I am running the VM on ESXi.
In my case, if I run 'ntpdc -np' after a bad alert, I get the following values:
 remote           local          st  poll  reach  delay    offset     disp
=============================================================================
=188.8.131.52     192.168.2.251   2    64    177  0.00102  -46.89972  0.25188
=184.108.40.206   192.168.2.251   2    64    177  0.00383  -46.90043  0.25189
=220.127.116.11   192.168.2.251   2    64    177  0.00117  -46.89955  0.25185
After 'service ntpd restart', the offsets against the hardware clock seem to be fixed:
 remote           local          st  poll  reach  delay    offset     disp
=============================================================================
=18.104.22.168    192.168.2.251  16    64      0  0.00000   0.000000  4.00000
*22.214.171.124   192.168.2.251   2    64      1  0.00117   0.000123  2.81735
=126.96.36.199    192.168.2.251  16    64      0  0.00000   0.000000  4.00000
But after a few minutes the offsets drift away again. Any ideas?
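In case it helps anyone watch the drift over time, here is a rough Python sketch that parses 'ntpdc -np' style output and lists peers whose offset looks too large. The names `parse_ntpdc_peers` and `suspicious_peers` and the 1.0 second threshold are my own assumptions for illustration; this is not how the CM agent does it and not CM's actual threshold.

```python
from typing import List, NamedTuple

# Sample taken from the 'ntpdc -np' output posted above.
SAMPLE = """\
 remote           local          st  poll  reach  delay    offset     disp
=============================================================================
=188.8.131.52     192.168.2.251   2    64    177  0.00102  -46.89972  0.25188
=184.108.40.206   192.168.2.251   2    64    177  0.00383  -46.90043  0.25189
=220.127.116.11   192.168.2.251   2    64    177  0.00117  -46.89955  0.25185
"""

MODE_CHARS = "+-=^~*"  # left-margin characters listed on the ntpdc page


class Peer(NamedTuple):
    mode: str      # left-margin character, or " " if none was present
    remote: str
    stratum: int
    reach: str     # kept as text; ntpdc prints the reach register in octal
    offset: float  # seconds


def parse_ntpdc_peers(text: str) -> List[Peer]:
    """Parse peer rows, skipping the header and the '=====' separator."""
    peers = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("remote") or set(line) == {"="}:
            continue
        mode, rest = (line[0], line[1:]) if line[0] in MODE_CHARS else (" ", line)
        fields = rest.split()
        if len(fields) != 8:
            continue  # not a peer row
        remote, _local, st, _poll, reach, _delay, offset, _disp = fields
        try:
            peers.append(Peer(mode, remote, int(st), reach, float(offset)))
        except ValueError:
            continue  # malformed numbers, ignore the row
    return peers


def suspicious_peers(peers: List[Peer], threshold_s: float = 1.0) -> List[Peer]:
    """Reachable peers whose absolute offset exceeds the assumed threshold."""
    return [p for p in peers if p.reach != "0" and abs(p.offset) > threshold_s]


if __name__ == "__main__":
    for p in suspicious_peers(parse_ntpdc_peers(SAMPLE)):
        print(f"{p.remote}: offset {p.offset:+.5f} s (mode '{p.mode}')")
```

Running it from cron with the real command output piped in would at least show when the offset starts to grow after an ntpd restart.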
11-04-2015 01:25 PM
11-16-2015 11:20 PM - edited 11-16-2015 11:33 PM
I had the same issue on a 10-node cluster built on a VMware hypervisor, with all 10 VMs running on that hypervisor.
I did not see the clock offset issue when I built a Cloudera cluster on physical machines without a hypervisor. I feel there is a latency issue with time sync, of about 3 minutes, when running on a hypervisor.
Please let us know whether running a Hadoop cluster on a hypervisor is recommended.
04-09-2016 03:35 PM
What exactly does the "=" mean then? I am not seeing anything in the CM agent logs or syslog. From what I read at https://www.eecis.udel.edu/~mills/ntp/html/ntpdc.html:
"a = means the remote server is being polled in client mode". So doesn't that mean we should look at those lines as well? My entire environment is suddenly alerting on this, and I have not yet found the cause. My environment is running Ubuntu 12.04 and some 14.04, and ntpdate works with the server I have configured.
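For reference, this is how I understand the left-margin characters from that ntpdc page, written as a small Python lookup. The names `NTPDC_PEER_MODES` and `describe_mode` are just mine, and the descriptions are my paraphrase of the page, so please check it against the documentation before relying on it.

```python
# Left-margin characters in 'ntpdc' peers output, paraphrased from the ntpdc page.
NTPDC_PEER_MODES = {
    "+": "symmetric active",
    "-": "symmetric passive",
    "=": "remote server is being polled in client mode",
    "^": "server is broadcasting to this address",
    "~": "remote peer is sending broadcasts",
    "*": "peer the server is currently synchronizing to",
}


def describe_mode(peer_line: str) -> str:
    """Return the meaning of the first character of a peer row."""
    return NTPDC_PEER_MODES.get(peer_line[:1], "unknown")


if __name__ == "__main__":
    print(describe_mode("=188.8.131.52 192.168.2.251 2 64 177 0.00102 -46.89972 0.25188"))
    print(describe_mode("*22.214.171.124 192.168.2.251 2 64 1 0.00117 0.000123 2.81735"))
```

If that reading is right, a row marked "=" is a configured server that is being polled but is not the one currently selected for synchronization, while "*" marks the selected one.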