Token expiry issue due to System time on machines.

Expert Contributor

I am struggling to fix an issue that I hit while running Hadoop MapReduce jobs on my cluster. The cluster was created through Ambari (it is not the sandbox) and has 4 nodes, including the master node. This is the error I get:

This token is expired. current time is 1454617494914 found 1454598336617 Note: System times on machines may be out of sync. Check system time and time zones.
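
For anyone reading along, the two numbers in that message look like epoch timestamps in milliseconds (that is an assumption; the message itself does not say so). A minimal Python sketch, under that assumption, to turn them into readable UTC times and see how far apart they are:

```python
from datetime import datetime, timedelta, timezone

# Values copied from the error message; assumed to be epoch milliseconds.
current_ms = 1454617494914
found_ms = 1454598336617

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_utc(ms: int) -> datetime:
    """Convert epoch milliseconds to a timezone-aware UTC datetime."""
    return EPOCH + timedelta(milliseconds=ms)


# Difference between the "current time" and the "found" value.
gap = timedelta(milliseconds=current_ms - found_ms)

print("current:", to_utc(current_ms).isoformat())
print("found:  ", to_utc(found_ms).isoformat())
print("gap:    ", gap)
```

On these values the gap comes out to about 5 hours 19 minutes, which is far more than ordinary clock drift, so a wrong clock or time zone on one of the nodes is a plausible suspect.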

I checked the time on all the nodes and found that it was incorrect on every node except the master. Since ntpd was failing to connect to its servers, I corrected the time manually on all the nodes.

Searching the internet, I found a setting, 'yarn.resourcemanager.rm.container-allocation.expiry-interval-ms', that can be used to increase the lifespan of a container. I could not find this setting anywhere in the advanced configuration on the Ambari dashboard. Can anyone help me understand what is going on?

1 ACCEPTED SOLUTION

Master Mentor
@Pradeep kumar

This is the exact root cause:

"I checked the time on all the nodes and found that it was incorrect on every node except the master. Since ntpd was failing to connect to its servers, I corrected the time manually on all the nodes."

Do you know why ntpd is failing?

17 REPLIES

Master Mentor

@Pradeep kumar it's a common error; a Google search gave me this link.

Master Mentor

@Pradeep kumar also make sure your firewall accepts all traffic from the servers in the cluster. You can open ports granularly or allow all traffic from each node. Refer to the CentOS docs for instructions.

Expert Contributor

Thanks Neeraj. I wish I could update the time using ntpd, but every command I tried to update the system time failed with the error "4 Feb 21:30:55 ntpdate[12169]: no server suitable for synchronization found". I have gone through a lot of material on the internet that discusses this error, but none of the suggestions helped, so I set the time manually. I will check with my company's network support to see whether a firewall is preventing ntpd from syncing with the server.

Expert Contributor

Also, I would like to understand why setting the time manually does not resolve the time sync problem. If I run "date" now, it shows almost the same time on all the nodes. The dates are the same, and the times differ by only a few seconds.

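A few seconds isn't going to matter. Kerberos and the rest of the security stack are fussy about clocks, but the tolerance is on the order of minutes (Kerberos allows five minutes of skew by default), not seconds.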

You can usually set up your network switch as an NTP server so that all the nodes sync with it. Or turn one of your machines into the NTP server and make it the reference source of time. Ideally, if detached from the network, you could hook up a GPS unit and run gpsd to be about as accurate as everything else on the internet.
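
If you want to see how far a node's clock is from whichever reference source you pick, a rough check can be done with a bare SNTP query. This is only a diagnostic sketch, not a replacement for ntpd: the function names are made up for illustration, the server name in the usage note is hypothetical, and the offset estimate ignores network delay asymmetry.

```python
import socket
import struct
import time

NTP_PORT = 123
# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_DELTA = 2208988800


def build_request() -> bytes:
    """Build a 48-byte SNTP client request (LI=0, VN=3, Mode=3)."""
    return b"\x1b" + 47 * b"\x00"


def parse_transmit_time(packet: bytes) -> float:
    """Return the server's transmit timestamp as Unix epoch seconds."""
    if len(packet) < 48:
        raise ValueError("NTP response too short")
    # The transmit timestamp occupies bytes 40-47: 32-bit seconds, 32-bit fraction.
    seconds, fraction = struct.unpack("!II", packet[40:48])
    return seconds - NTP_UNIX_DELTA + fraction / 2**32


def clock_offset(server: str, timeout: float = 2.0) -> float:
    """Rough offset (server minus local) in seconds.

    Compares the server's transmit time with the midpoint of the local
    send and receive times, so it assumes a symmetric network path.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sent = time.time()
        sock.sendto(build_request(), (server, NTP_PORT))
        packet, _ = sock.recvfrom(512)
        received = time.time()
    return parse_transmit_time(packet) - (sent + received) / 2
```

Calling clock_offset("ntp.internal.example") on each node (the hostname is a placeholder for your own reference server) gives a number you can compare across the cluster. If the call times out, that points at the same reachability or firewall problem that makes ntpdate report "no server suitable for synchronization found".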

Expert Contributor

Thanks Neeraj for your support. The problem was ntpd: my nodes could not reach the known NTP time servers. I got the address of an internal NTP server in my company, and everything started working fine.