Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Yarn jar mapreduce-examples.jar pi 5 10 fails with socket timeout exception

New Member

Hello,

I am running the command below from the MapReduce examples for Pi. It fails, and I can see a socket timeout exception in the logs.

I have not been able to find a solution anywhere so far, and would be glad if someone could help.

Command: yarn jar hadoop-mapreduce-examples.jar pi 5 10

(From the directory: /usr/hdp/2.3.0.0-2557/hadoop-mapreduce)

Below is the log trace:

2016-04-20 06:12:48,333 WARN [RMCommunicator Allocator] org.apache.hadoop.ipc.Client: Exception encountered while connecting to the server : java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.17.0.2:53751 remote=node1/172.17.0.2:8030]
2016-04-20 06:12:51,884 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator: ERROR IN CONTACTING RM. 
java.io.IOException: Failed on local exception: java.io.IOException: java.net.SocketTimeoutException: 60000 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/172.17.0.2:53751 remote=node1/172.17.0.2:8030]; Host Details : local host is: "node1/172.17.0.2"; destination host is: "node1":8030; 
	at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:773

I can see this property under Advanced yarn-site: yarn.resourcemanager.scheduler.address = node1:8030

Hosts file entry:

[root@node1 ~]# cat /etc/hosts
172.17.0.2 node1
127.0.0.1 localhost
::1 localhost ip6-localhost ip6-loopback
fe00::0 ip6-localnet
ff00::0 ip6-mcastprefix
ff02::1 ip6-allnodes
ff02::2 ip6-allrouters
[root@node1 ~]#

I am not sure what the problem is. I can ping localhost, node1, and 127.0.0.1 from the node1 terminal.

Regards,

Vinay MP

1 ACCEPTED SOLUTION

New Member

Finally, I managed to get a new 16 GB machine on which I can run the VM with good performance.

For initial practice I was using an 8 GB machine.

I used the same VM: the command went through fine on the 16 GB machine and failed on the 8 GB machine.

I am not exactly sure whether memory was insufficient to run these tests on the 8 GB machine (I didn't see any OOM or related exceptions there), but I am glad the problem is solved.

@Ian Roberts, @Predrag Minovic Thanks for taking the time to reply.

Regards,

Vinay MP


4 REPLIES

Expert Contributor

Hi @Vinay MP, it seems you are not able to contact the ResourceManager. What port is the RM listening on? You should be able to run ps -ef | grep resourcemanager and then netstat -tulpn | grep <PID> to find out.
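
If netstat is not convenient, a plain TCP connect from the client side gives a similar answer. The following is only a minimal sketch: port_is_open is a hypothetical helper name, and the host and port in the usage comment are the ones from the log trace. Note that the trace already shows the socket as connected before the read timed out, so a successful connect only rules out "nothing is listening". It does not show that the RM answers in time.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


# Example call, using the scheduler address from the log (run it on the cluster node):
# port_is_open("node1", 8030)
```

A False result for node1:8030 would point at the listener or at name resolution, while True would shift attention to why the RM is slow to respond.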

Master Guru

In your /etc/hosts, move the line "172.17.0.2 node1" from the top to line number 2:

127.0.0.1	localhost
172.17.0.2	node1

Then run "hostname"; it should print "node1". If it does not, run "hostname node1". Also check the hostname in the /etc/sysconfig/network file. Finally, as Ian suggested, check whether the RM is up and running and listening on ports 8030 and 8050 (and a few others).
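
To see which address each name ends up with after editing the file, the hosts text can be parsed directly. This is an illustrative sketch, not a tool from the Hadoop distribution: parse_hosts is a hypothetical helper, it keeps the first mapping found for a name (an assumption about how the resolver treats duplicates), and the sample text is the two-line layout suggested above.

```python
def parse_hosts(text: str) -> dict[str, str]:
    """Map each host name in hosts-file text to the first address listed for it."""
    mapping: dict[str, str] = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            # Assumption: the first entry for a name wins.
            mapping.setdefault(name, address)
    return mapping


sample = """\
127.0.0.1 localhost
172.17.0.2 node1
"""

hosts = parse_hosts(sample)
print(hosts["node1"])      # 172.17.0.2
print(hosts["localhost"])  # 127.0.0.1
```

On the node itself, passing the contents of /etc/hosts to parse_hosts and looking up the output of "hostname" should give 172.17.0.2 rather than a loopback address.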

New Member

Hi @Ian Roberts, @Predrag Minovic

Thanks for the suggestions. I will try them and update.

For now, I checked netstat and could see that the resourcemanager was up and listening on 8030, 8050, and a few more ports.

All of a sudden I am not able to open a terminal session to node1 (one of the hosts in my VM). I will fix that and then verify the MapReduce example.

Regards,

Vinay MP
