Created on 10-15-2014 01:16 AM - edited 09-16-2022 02:09 AM
I'm trying to run the Hadoop pi example. It was running without any problems on a single node, but now that I'm working on a multinode cluster it's giving the error below. If anyone could please advise.
mapred-site.xml:
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <!-- In: conf/mapred-site.xml -->
  <property>
    <name>mapred.job.tracker</name>
    <value>master:54311</value>
    <description>The host and port that the MapReduce job tracker runs at.
      If "local", then jobs are run in-process as a single map and reduce task.
    </description>
  </property>
  <property>
    <name>mapred.child.java.opts</name>
    <value>-Xmx2048m</value>
  </property>
  <property>
    <name>mapred.shuffle.input.buffer.percent</name>
    <value>0.2</value>
  </property>
</configuration>
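As a quick sanity check, the property values above can be read back programmatically. A minimal Python sketch (the XML literal is abbreviated from the file above; this is just an illustration, not a Hadoop API):

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the mapred-site.xml shown above.
MAPRED_SITE = """<?xml version="1.0"?>
<configuration>
  <property><name>mapred.job.tracker</name><value>master:54311</value></property>
  <property><name>mapred.child.java.opts</name><value>-Xmx2048m</value></property>
  <property><name>mapred.shuffle.input.buffer.percent</name><value>0.2</value></property>
</configuration>"""

def read_props(xml_text):
    """Return {name: value} for every <property> in a Hadoop *-site.xml."""
    root = ET.fromstring(xml_text)
    return {p.findtext("name"): p.findtext("value") for p in root.findall("property")}

props = read_props(MAPRED_SITE)
print(props["mapred.job.tracker"])  # master:54311
```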
Console output:
Number of Maps = 3
Samples per Map = 10
14/10/11 20:34:20 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
14/10/11 20:34:54 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id
Wrote input for Map #0
Wrote input for Map #1
Wrote input for Map #2
Starting Job
14/10/11 20:34:54 WARN mapreduce.JobSubmitter: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
14/10/11 20:34:55 INFO input.FileInputFormat: Total input paths to process : 3
14/10/11 20:34:55 WARN conf.Configuration: mapred.map.tasks is deprecated. Instead, use mapreduce.job.maps
14/10/11 20:34:55 INFO mapreduce.JobSubmitter: number of splits:3
14/10/11 20:34:55 INFO mapreduce.JobSubmitter: adding the following namenodes' delegation tokens:null
14/10/11 20:34:55 INFO mapreduce.Job: Running job: job_201410112034_0001
14/10/11 20:34:56 INFO mapreduce.Job: map 0% reduce 0%
14/10/11 20:35:05 INFO mapreduce.Job: map 33% reduce 0%
14/10/11 20:35:08 INFO mapreduce.Job: map 100% reduce 0%
14/10/11 20:35:14 INFO mapreduce.Job: map 100% reduce 11%
14/10/11 20:35:31 INFO mapreduce.Job: Task Id : attempt_201410112034_0001_r_000000_0, Status : FAILED
org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#1
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:124)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
    at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
    at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:253)
    at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.copyFailed(ShuffleScheduler.java:187)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:234)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)
14/10/11 20:35:32 INFO mapreduce.Job: map 100% reduce 0%
14/10/11 20:35:41 INFO mapreduce.Job: map 100% reduce 11%
14/10/11 20:35:49 INFO mapreduce.Job: Task Id : attempt_201410112034_0001_m_000000_0, Status : FAILED
Too many fetch-failures
14/10/11 20:35:49 WARN mapreduce.Job: Error reading task outputhttp://userA:50060/tasklog?plaintext=true&attemptid=attempt_201410112034_0001_m_000000_0&filter=stdout
14/10/11 20:35:49 WARN mapreduce.Job: Error reading task outputhttp://userA:50060/tasklog?plaintext=true&attemptid=attempt_201410112034_0001_m_000000_0&filter=stderr
14/10/11 20:36:13 INFO mapreduce.Job: Task Id : attempt_201410112034_0001_r_000000_1, Status : FAILED
org.apache.hadoop.mapreduce.task.reduce.Shuffle$ShuffleError: error in shuffle in fetcher#2
    at org.apache.hadoop.mapreduce.task.reduce.Shuffle.run(Shuffle.java:124)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:217)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:742)
    at org.apache.hadoop.mapred.Child.main(Child.java:211)
Caused by: java.io.IOException: Exceeded MAX_FAILED_UNIQUE_FETCHES; bailing-out.
    at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.checkReducerHealth(ShuffleScheduler.java:253)
    at org.apache.hadoop.mapreduce.task.reduce.ShuffleScheduler.copyFailed(ShuffleScheduler.java:187)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.copyFromHost(Fetcher.java:234)
    at org.apache.hadoop.mapreduce.task.reduce.Fetcher.run(Fetcher.java:149)
14/10/11 20:36:14 INFO mapreduce.Job: map 100% reduce 0%
14/10/11 20:36:22 INFO mapreduce.Job: Task Id : attempt_201410112034_0001_m_000001_0, Status : FAILED
Too many fetch-failures
14/10/11 20:36:22 WARN mapreduce.Job: Error reading task outputhttp://userA:50060/tasklog?plaintext=true&attemptid=attempt_201410112034_0001_m_000001_0&filter=stdout
14/10/11 20:36:22 WARN mapreduce.Job: Error reading task outputhttp://userA:50060/tasklog?plaintext=true&attemptid=attempt_201410112034_0001_m_000001_0&filter=stderr
14/10/11 20:36:23 INFO mapreduce.Job: map 100% reduce 11%
14/10/11 20:36:32 INFO mapreduce.Job: map 100% reduce 100%
14/10/11 20:36:34 INFO mapreduce.Job: Job complete: job_201410112034_0001
14/10/11 20:36:34 INFO mapreduce.Job: Counters: 33
  FileInputFormatCounters
    BYTES_READ=354
  FileSystemCounters
    FILE_BYTES_READ=72
    FILE_BYTES_WRITTEN=252
    HDFS_BYTES_READ=765
    HDFS_BYTES_WRITTEN=215
  Shuffle Errors
    BAD_ID=0
    CONNECTION=0
    IO_ERROR=1
    WRONG_LENGTH=0
    WRONG_MAP=0
    WRONG_REDUCE=0
  Job Counters
    Data-local map tasks=5
    Total time spent by all maps waiting after reserving slots (ms)=0
    Total time spent by all reduces waiting after reserving slots (ms)=0
    SLOTS_MILLIS_MAPS=11950
    SLOTS_MILLIS_REDUCES=80809
    Launched map tasks=5
    Launched reduce tasks=3
  Map-Reduce Framework
    Combine input records=0
    Combine output records=0
    Failed Shuffles=1
    GC time elapsed (ms)=6
    Map input records=3
    Map output bytes=54
    Map output records=6
    Merged Map outputs=3
    Reduce input groups=2
    Reduce input records=6
    Reduce output records=0
    Reduce shuffle bytes=84
    Shuffled Maps =3
    Spilled Records=12
    SPLIT_RAW_BYTES=411
Job Finished in 100.067 seconds
Estimated value of Pi is 3.60000000000000000000
EDIT:
Following sonic's comment, I tried the answer from this question: shuffle error:exceeded max_failed_unique_matche : bailing out. The master and the slave can ping each other, but when I run curl -I http://slave:50060/ it gives this error:
curl: (7) couldn't connect to host
I used this to check the port: telnet 192.168.0.1 50060
and this was the output:
Trying 192.168.0.1...
telnet: Unable to connect to remote host: Connection refused
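The refused connection means nothing is listening on port 50060, which is the TaskTracker's HTTP port that reducers use to fetch map output. The same probe that telnet/curl perform can be sketched in Python (the "master"/"slave" hostnames are assumed from my /etc/hosts; adjust to your cluster):

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds (i.e. something
    is listening there), False on refusal/timeout or a resolution failure --
    the same check as the telnet/curl probes above."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Probe the TaskTracker HTTP port on every node.
for host in ("master", "slave"):
    print(host, 50060, port_open(host, 50060))
```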
Then I ran: sudo netstat -plntu
and that was the result:
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:53            0.0.0.0:*               LISTEN      1746/dnsmasq
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      693/sshd
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      986/cupsd
tcp6       0      0 :::22                   :::*                    LISTEN      693/sshd
udp        0      0 127.0.0.1:53            0.0.0.0:*                           1746/dnsmasq
udp        0      0 0.0.0.0:68              0.0.0.0:*                           12396/dhclient
udp        0      0 0.0.0.0:5353            0.0.0.0:*                           915/avahi-daemon: r
udp        0      0 0.0.0.0:46881           0.0.0.0:*                           915/avahi-daemon: r
udp6       0      0 :::5353                 :::*                                915/avahi-daemon: r
udp6       0      0 :::47491                :::*                                915/avahi-daemon: r
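Port 50060 is absent from that listing, i.e. no TaskTracker is listening on this node, which would explain the failed fetches. A small sketch that checks this mechanically (the NETSTAT literal is abbreviated to the TCP lines from the listing above; this parsing is illustrative, not a Hadoop tool):

```python
# Abbreviated TCP lines from the `sudo netstat -plntu` output above.
NETSTAT = """\
tcp        0      0 127.0.0.1:53            0.0.0.0:*               LISTEN      1746/dnsmasq
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      693/sshd
tcp        0      0 127.0.0.1:631           0.0.0.0:*               LISTEN      986/cupsd
tcp6       0      0 :::22                   :::*                    LISTEN      693/sshd"""

def listening_ports(netstat_text):
    """Extract the set of local TCP ports in LISTEN state from netstat output."""
    ports = set()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) >= 6 and fields[0].startswith("tcp") and "LISTEN" in fields:
            # Local Address is field 3; the port follows the last ':'.
            ports.add(int(fields[3].rsplit(":", 1)[1]))
    return ports

print(50060 in listening_ports(NETSTAT))  # False -> no TaskTracker on 50060
```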
So if anyone could please advise.
Here are the files I'm using, which are the same on both the master and slave machines.
slaves:
master
slave
masters:
master
/etc/hosts:
192.168.0.1 master
192.168.0.2 slave
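One commonly reported cause of "Exceeded MAX_FAILED_UNIQUE_FETCHES" is a hosts file that maps a cluster hostname to a loopback address (for example, Ubuntu's default `127.0.1.1 <hostname>` line): the daemons then bind to loopback, so remote reducers can't fetch map output. A sketch that flags such entries (the `loopback_mapped` helper is my own illustration, not a Hadoop API):

```python
# The /etc/hosts content from above.
HOSTS = """\
192.168.0.1 master
192.168.0.2 slave"""

def loopback_mapped(hosts_text, names=("master", "slave")):
    """Return the cluster hostnames that an /etc/hosts-style file maps to a
    loopback (127.x.x.x) address. Such entries make Hadoop daemons bind to
    loopback, so other nodes cannot reach them on ports like 50060."""
    bad = []
    for line in hosts_text.splitlines():
        parts = line.split()
        if parts and parts[0].startswith("127."):
            bad.extend(n for n in parts[1:] if n in names)
    return bad

print(loopback_mapped(HOSTS))  # [] -> this hosts file looks fine
```

If this returns a hostname, removing (or commenting out) that loopback line and restarting the daemons is worth trying.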