Created 02-04-2016 03:11 PM
I am struggling to fix an issue that I am facing while executing Hadoop MapReduce jobs on my cluster. The cluster was created through Ambari (not the sandbox) and has 4 nodes (including the master node). This is the error that I get:
This token is expired. current time is 1454617494914 found 1454598336617 Note: System times on machines may be out of sync. Check system time and time zones.
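As a rough check on the numbers in that message: both values look like milliseconds since the Unix epoch, so their difference can be worked out directly in the shell. The variable names below are only illustrative; the two timestamps are copied from the error text.

```shell
# Timestamps copied from the error message (milliseconds since the Unix epoch).
current_ms=1454617494914   # value after "current time is"
found_ms=1454598336617     # value after "found"

# Difference in milliseconds, then converted to whole minutes.
skew_ms=$(( current_ms - found_ms ))
skew_min=$(( skew_ms / 60000 ))

echo "difference: ${skew_ms} ms (about ${skew_min} minutes)"
```

A gap of several hours between the two values is more consistent with clocks that disagree across nodes than with a token that simply aged out during a normal job.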
I checked the time on all the nodes and found that, except on the master node, the time on every node was incorrect. Since ntpd was failing to connect to its servers, I corrected the time manually on all the nodes.
Searching the internet, I found that there is a setting 'yarn.resourcemanager.rm.container-allocation.expiry-interval-ms' which can be used to increase the lifespan of the container. I could not find this setting anywhere in the advanced configuration on the Ambari dashboard. Can anyone help me understand what is going on?
Created 02-04-2016 03:14 PM
This is the exact root cause:
"I checked the time on all the nodes and found that, except on the master node, the time on every node was incorrect. Since ntpd was failing to connect to its servers, I corrected the time manually on all the nodes."
Do you know why ntpd is failing?
Created 02-04-2016 03:11 PM
Install NTP, @Pradeep kumar.
Created 02-04-2016 03:12 PM
# Setup NTPD
chkconfig --list ntpd
chkconfig ntpd on
service ntpd stop
ntpdate pool.ntp.org
service ntpd start
Created 02-04-2016 03:13 PM
@Pradeep kumar I also think the issue is not with YARN but with Kerberos.
Created 02-04-2016 03:15 PM
The above is for RHEL 6. For RHEL 7, use the commands below.
# Setup NTPD
yum install -y ntp
systemctl is-enabled ntpd
systemctl enable ntpd
# enable firewall rules for ntp
firewall-cmd --add-service=ntp --permanent
firewall-cmd --reload
systemctl stop ntpd
ntpdate pool.ntp.org
systemctl start ntpd
systemctl status ntpd
echo "wait 30 sec for time to synchronize"
sleep 30
ntpq -p
date -R
Created 02-04-2016 04:05 PM
I installed the cluster on CentOS, so it would be great if you could post the CentOS version of the firewall rule. The firewall could be the reason why my nodes are not able to contact the time servers. Many thanks.
Created 02-04-2016 04:10 PM
here you go @Pradeep kumar link
Created 02-05-2016 01:04 PM
Thanks Artem for your support. The problem was not with the firewall, but the nodes were not able to reach the known ntpd time servers.
Created 02-05-2016 01:40 PM
@Pradeep kumar great, glad to help.
Created 02-04-2016 04:31 PM
I have tried disabling the firewall and running the command '/usr/sbin/ntpdate pool.ntp.org', but I still get the error "no server suitable for synchronization found".
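One way to narrow down that error is to separate name resolution from NTP reachability. The sketch below is only a suggestion: `diagnose_ntp` is a hypothetical helper name, and it assumes `getent` and `ntpdate` are installed on the node. It queries the server without changing the clock.

```shell
# Hypothetical helper: checks DNS first, then queries the NTP server read-only.
diagnose_ntp() {
  server="${1:-pool.ntp.org}"

  # A DNS failure produces the same ntpdate error as an unreachable server,
  # so confirm the name resolves before blaming the network path.
  if ! getent hosts "$server"; then
    echo "cannot resolve $server" >&2
    return 1
  fi

  # -q: query only, do not set the clock.
  # -u: send from an unprivileged port instead of port 123.
  ntpdate -q -u "$server"
}
```

Running `diagnose_ntp pool.ntp.org` on each node should show whether the name resolves and whether any server answers. If the name resolves but no server replies even with the firewall disabled, the block is probably somewhere outside the node itself.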