MapReduce service check fails with ipc.Client connection timed out error

Contributor

MapReduce service check fails with ipc.Client connection timed out error

2018-07-30 14:39:43,127 - Execute['hadoop --config /usr/hdp/current/hadoop-client/conf jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples-2.*.jar wordcount /user/ambari-qa/mapredsmokeinput /user/ambari-qa/mapredsmokeoutput'] {'logoutput': True, 'try_sleep': 5, 'environment': {}, 'tries': 1, 'user': 'ambari-qa', 'path': ['/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin:/usr/hdp/current/hadoop-yarn-client/bin']}
18/07/30 14:39:45 INFO impl.TimelineClientImpl: Timeline service address: http://hostname:8188/ws/v1/timeline/
18/07/30 14:39:45 INFO client.RMProxy: Connecting to ResourceManager at hostname/ip:8050
18/07/30 14:39:45 INFO client.AHSProxy: Connecting to Application History server at hostname/ip:10200
18/07/30 14:40:49 INFO ipc.Client: Retrying connect to server: hostname/ip:8050. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/07/30 14:41:53 INFO ipc.Client: Retrying connect to server: hostname/ip:8050. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/07/30 14:42:57 INFO ipc.Client: Retrying connect to server: hostname/ip:8050. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/07/30 14:44:01 INFO ipc.Client: Retrying connect to server: hostname/ip:8050. Already tried 3 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
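
The spacing between the retry lines above can be read off the timestamps. The following is a minimal Python sketch that only does that arithmetic; the timestamps are copied from the log above, and the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the client log lines above (format: yy/MM/dd HH:mm:ss).
stamps = [
    "18/07/30 14:39:45",  # Connecting to ResourceManager
    "18/07/30 14:40:49",  # Already tried 0 time(s)
    "18/07/30 14:41:53",  # Already tried 1 time(s)
    "18/07/30 14:42:57",  # Already tried 2 time(s)
    "18/07/30 14:44:01",  # Already tried 3 time(s)
]

times = [datetime.strptime(s, "%y/%m/%d %H:%M:%S") for s in stamps]

# Seconds elapsed between consecutive log lines.
gaps = [int((later - earlier).total_seconds())
        for earlier, later in zip(times, times[1:])]
print(gaps)  # [64, 64, 64, 64]
```

Each attempt takes about 64 seconds although the retry policy sleeps for only 1 second, so almost all of the time is spent waiting for the connect call itself to give up. That fits the "Connection timed out" exception in the logs, and it probably points toward packets being dropped (for example by a firewall or a wrong address) rather than a stopped service, which would usually be refused immediately.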

Logs:

WARN ipc.Client (Client.java:handleConnectionFailure(886)) - Failed to connect to server: ResourceManager-Hostname/ResourceManager-ip-address:8050: retries get failed due to exceeded maximum allowed retries number: 50
java.net.ConnectException: Connection timed out
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:744)
    at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
    at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:650)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:745)
    at org.apache.hadoop.ipc.Client$Connection.access$3200(Client.java:397)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1618)
    at org.apache.hadoop.ipc.Client.call(Client.java:1449)
    at org.apache.hadoop.ipc.Client.call(Client.java:1396)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy77.getApplicationReport(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationClientProtocolPBClientImpl.getApplicationReport(ApplicationClientProtocolPBClientImpl.java:191)
    at sun.reflect.GeneratedMethodAccessor58.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:278)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:194)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:176)
    at com.sun.proxy.$Proxy78.getApplicationReport(Unknown Source)
    at org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.isApplicationTerminated(AggregatedLogDeletionService.java:155)
    at org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.deleteOldLogDirsFrom(AggregatedLogDeletionService.java:101)
    at org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:85)
    at java.util.TimerThread.mainLoop(Timer.java:555)
    at java.util.TimerThread.run(Timer.java:505)

Port 8050 is open and listening:

[root@bhwx24hwxworker2 yarn]# netstat --listen
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address                 Foreign Address  State
tcp        0      0 *:8188                        *:*              LISTEN
tcp        0      0 *:8030                        *:*              LISTEN
tcp        0      0 *:8670                        *:*              LISTEN
tcp        0      0 *:8191                        *:*              LISTEN
tcp        0      0 *:sqlexec                     *:*              LISTEN
tcp        0      0 *:10020                       *:*              LISTEN
tcp        0      0 *:eforward                    *:*              LISTEN
tcp        0      0 *:40070                       *:*              LISTEN
tcp        0      0 localhost:40071               *:*              LISTEN
tcp        0      0 *:8040                        *:*              LISTEN
tcp        0      0 *:40072                       *:*              LISTEN
tcp        0      0 *:7337                        *:*              LISTEN
tcp        0      0 *:fs-agent                    *:*              LISTEN
tcp        0      0 *:8141                        *:*              LISTEN
tcp        0      0 *:45454                       *:*              LISTEN
tcp        0      0 *:19888                       *:*              LISTEN
tcp        0      0 bhwx24hwxworke:ciphire-serv   *:*              LISTEN
tcp        0      0 *:10033                       *:*              LISTEN
tcp        0      0 *:8050                        *:*              LISTEN
tcp        0      0 *:39987                       *:*              LISTEN
tcp        0      0 *:ssh                         *:*              LISTEN
tcp        0      0 *:7447                        *:*              LISTEN
tcp        0      0 *:trisoap                     *:*              LISTEN
tcp        0      0 *:radan-http                  *:*              LISTEN
tcp        0      0 *:irisa                       *:*              LISTEN
tcp        0      0 *:ca-audit-da                 *:*              LISTEN
tcp        0      0 localhost:8089                *:*              LISTEN
tcp        0      0 localhost:metasys             *:*              LISTEN
tcp        0      0 localhost:smtp                *:*              LISTEN
tcp        0      0 *:13562                       *:*              LISTEN
tcp        0      0 *:ssh                         *:*              LISTEN
udp        0      0 bhwx24hwxworker2.cse-int:ntp  *:*
udp        0      0 localhost:ntp                 *:*
udp        0      0 *:ntp                         *:*
udp        0      0 *:bootpc                      *:*
udp        0      0 *:ntp                         *:*

2 REPLIES

Expert Contributor

Check whether you can telnet to RM:8050. Also check the netstat output on the RM machine to see whether there are any connections from the node on which the service check is running.
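
If telnet is not installed on the node, the same reachability check can be approximated with a few lines of Python. This is a minimal sketch, not part of Hadoop or Ambari; the function name `can_connect` and the 5-second default timeout are assumptions chosen for illustration.

```python
import socket


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

For example, `can_connect("RM-host", 8050)` would be run from the node that fires the service check, with `RM-host` replaced by the actual ResourceManager hostname. A result of False after the full timeout suggests dropped packets, while an immediate False is more typical of a refused connection or a resolution failure.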

Contributor

@schhabra: Thanks for the response. The service check is being run from the same host where the RM is installed.
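
Because the client and the RM share a host here, it may be worth confirming which IP address the RM hostname resolves to on that host, since the client connects to the resolved hostname/ip pair shown in its log. The sketch below is one way to list those addresses; the function name `resolved_addresses` is hypothetical, and this is an illustrative check rather than a known fix for this issue.

```python
import socket


def resolved_addresses(host, port):
    """Return the sorted, de-duplicated IP addresses that `host` resolves to."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling `resolved_addresses("RM-host", 8050)` with the real hostname and comparing the result against the addresses assigned to the machine's interfaces would show whether the name points at an address that is not local or not reachable, which is one possible cause of a connect timeout on the same host.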

18/07/31 11:11:34 INFO impl.TimelineClientImpl: Timeline service address: http://RM-host:8188/ws/v1/timeline/
18/07/31 11:11:34 INFO client.RMProxy: Connecting to ResourceManager at RM-host/RM-ip:8050
18/07/31 11:11:35 INFO client.AHSProxy: Connecting to Application History server at RM-host/RM-ip:10200
18/07/31 11:12:39 INFO ipc.Client: Retrying connect to server: RM-host/RM-ip:8050. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=50, sleepTime=1000 MILLISECONDS)
18/07/31 11:13:43 INFO ipc.Client: Retrying connect to server: RM-host/RM-ip:8050. Already tried 1 time(s); retry policy is