"Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 9009" when running a Hadoop streaming job

This is driving me insane. I have debugged this many times to get this far, but I can't solve this last error. I'm running a single-node cluster, and this is the command I ran:

hadoop jar C:\Hadoop\hadoop-2.9.2\hadoop-streaming-2.9.2.jar -verbose -input /hadoop/data.csv -output /hadoop/output.txt -file C:\Hadoop\hadoop-2.9.2\mapper.py -mapper "python mapper.py" -reducer C:\Hadoop\hadoop-2.9.2\reducer.py

Here is the error:

22/12/09 01:15:45 WARN streaming.StreamJob: -file option is deprecated, please use generic option -files instead.
packageJobJar: [C:\Hadoop\hadoop-2.9.2\mapper.py, /C:/Users/matsu/AppData/Local/Temp/hadoop-unjar359085693376764278/] [] C:\Users\matsu\AppData\Local\Temp\streamjob5458220146420962625.jar tmpDir=null
22/12/09 01:15:46 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
22/12/09 01:15:47 INFO client.RMProxy: Connecting to ResourceManager at /0.0.0.0:8032
22/12/09 01:15:48 INFO mapred.FileInputFormat: Total input files to process : 1
22/12/09 01:15:48 INFO mapreduce.JobSubmitter: number of splits:2
22/12/09 01:15:48 INFO Configuration.deprecation: yarn.resourcemanager.system-metrics-publisher.enabled is deprecated. Instead, use yarn.system-metrics-publisher.enabled
22/12/09 01:15:48 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1670535169635_0016
22/12/09 01:15:49 INFO impl.YarnClientImpl: Submitted application application_1670535169635_0016
22/12/09 01:15:49 INFO mapreduce.Job: The url to track the job: http://DESKTOP-IAB4BFI:8088/proxy/application_1670535169635_0016/
22/12/09 01:15:49 INFO mapreduce.Job: Running job: job_1670535169635_0016
22/12/09 01:15:56 INFO mapreduce.Job: Job job_1670535169635_0016 running in uber mode : false
22/12/09 01:15:56 INFO mapreduce.Job:  map 0% reduce 0%
22/12/09 01:16:01 INFO mapreduce.Job: Task Id : attempt_1670535169635_0016_m_000001_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 9009
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:177)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:171)

22/12/09 01:16:01 INFO mapreduce.Job: Task Id : attempt_1670535169635_0016_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 9009
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:177)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:171)

22/12/09 01:16:06 INFO mapreduce.Job: Task Id : attempt_1670535169635_0016_m_000001_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 9009
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:177)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:171)

22/12/09 01:16:06 INFO mapreduce.Job: Task Id : attempt_1670535169635_0016_m_000000_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 9009
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:177)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:171)

22/12/09 01:16:11 INFO mapreduce.Job: Task Id : attempt_1670535169635_0016_m_000001_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 9009
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:177)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:171)

22/12/09 01:16:12 INFO mapreduce.Job: Task Id : attempt_1670535169635_0016_m_000000_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 9009
        at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
        at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
        at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
        at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:177)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1893)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:171)

22/12/09 01:16:17 INFO mapreduce.Job:  map 100% reduce 100%
22/12/09 01:16:18 INFO mapreduce.Job: Job job_1670535169635_0016 failed with state FAILED due to: Task failed task_1670535169635_0016_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0

22/12/09 01:16:18 INFO mapreduce.Job: Counters: 14
        Job Counters
                Failed map tasks=7
                Killed map tasks=1
                Killed reduce tasks=1
                Launched map tasks=8
                Other local map tasks=6
                Data-local map tasks=2
                Total time spent by all maps in occupied slots (ms)=28036
                Total time spent by all reduces in occupied slots (ms)=0
                Total time spent by all map tasks (ms)=28036
                Total vcore-milliseconds taken by all map tasks=28036
                Total megabyte-milliseconds taken by all map tasks=28708864
        Map-Reduce Framework
                CPU time spent (ms)=0
                Physical memory (bytes) snapshot=0
                Virtual memory (bytes) snapshot=0
22/12/09 01:16:18 ERROR streaming.StreamJob: Job not successful!
Streaming Command Failed!


When I run the command with no -mapper argument it works fine, so the mapper script is probably the problem. Here is my mapper script:

#!C:\Users\matsu\AppData\Local\Microsoft\WindowsApps/python.exe


import sys

# read input from stdin
for line in sys.stdin:
    # split the input line into user ID, movie ID, and rating
    user_id, movie_id, rating = line.strip().split(",")

    # emit the user ID and movie ID as key-value pairs
    print("{}\t{}".format(user_id, movie_id))
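
To rule out the parsing logic itself, the same mapper logic can be exercised locally without Hadoop. This is only a sketch: the function name `map_lines` and the sample rows are made up for illustration, and unlike the script above it skips rows that don't have exactly three fields instead of raising an error.

```python
import io


def map_lines(lines):
    """Yield 'user_id<TAB>movie_id' for each 'user,movie,rating' line."""
    for line in lines:
        fields = line.strip().split(",")
        if len(fields) != 3:
            # Assumption: rows without exactly three fields are skipped
            # rather than crashing the mapper with a ValueError.
            continue
        user_id, movie_id, _rating = fields
        yield "{}\t{}".format(user_id, movie_id)


if __name__ == "__main__":
    # Hypothetical sample input standing in for a few rows of data.csv.
    sample = io.StringIO("1,10,4.5\n2,20,3.0\nbad line\n")
    for out in map_lines(sample):
        print(out)
```

If this behaves as expected on a few real rows, the failure is more likely in how the task launches the script than in the script's logic.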

I tried removing the first line, changing it, and even changing what the script does just to test, but it always gives me the same error.
Any minuscule amount of help would be much appreciated!
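
As far as I can tell, 9009 is the exit code that the Windows command interpreter returns when a command name cannot be resolved, so one thing worth checking is whether `python` resolves to a real executable. Below is a minimal sketch of that check using the standard library's `shutil.which`. The helper name `resolve_command` and the list of candidate names are assumptions for illustration, and the PATH seen by the YARN task may differ from the PATH of the shell this is run from.

```python
import shutil


def resolve_command(name, path=None):
    """Return the full path that would be run for `name`, or None if unresolved.

    `path` is an optional PATH-style string; None means the current
    process's PATH environment variable.
    """
    return shutil.which(name, path=path)


if __name__ == "__main__":
    # Candidate interpreter names; which of these exist depends on the machine.
    for cmd in ("python", "python3", "py"):
        print(cmd, "->", resolve_command(cmd))
```

A result of `None` for `python` would be consistent with the 9009 failure, though it would not prove that this is the cause.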
