Support Questions

bandarusridhar1 · ‎08-04-2016

Hi everyone,

I am getting this error when I run TestDFSIO. The job actually finishes successfully (according to the JobTracker, at least), but this is what I get on the console:

crawler@d1r2n2:/hadoop$ yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-2.7.1.2.3.2.0-2950-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 100000
16/02/08 17:23:43 INFO fs.TestDFSIO: nrFiles = 10
16/02/08 17:23:43 INFO fs.TestDFSIO: fileSize (MB) = 1000
16/02/08 17:23:43 INFO fs.TestDFSIO: bufferSize = 1000000
16/02/08 17:23:43 INFO fs.TestDFSIO: creating control file: 1000 mega bytes, 10 files
16/02/08 17:23:44 INFO fs.TestDFSIO: created control files for: 10 files
16/02/08 17:23:44 INFO mapred.FileInputFormat: Total input paths to process : 10
16/02/08 17:23:44 INFO mapred.JobClient: Running job: job_201304191712_0002
16/02/08 17:23:45 INFO mapred.JobClient:  map 0% reduce 0%
16/02/08 17:24:06 INFO mapred.JobClient:  map 20% reduce 0%
16/02/08 17:24:07 INFO mapred.JobClient:  map 30% reduce 0%
16/02/08 17:24:09 INFO mapred.JobClient:  map 50% reduce 0%
16/02/08 17:24:11 INFO mapred.JobClient:  map 60% reduce 0%
16/02/08 17:24:12 INFO mapred.JobClient:  map 90% reduce 0%
16/02/08 17:24:13 INFO mapred.JobClient:  map 100% reduce 0%
16/02/08 17:24:21 INFO mapred.JobClient:  map 100% reduce 33%
16/02/08 17:24:22 INFO mapred.JobClient:  map 100% reduce 100%
16/02/08 17:24:23 INFO mapred.JobClient: Job complete: job_201304191712_0002
16/02/08 17:24:23 INFO mapred.JobClient: Counters: 33
16/02/08 17:24:23 INFO mapred.JobClient:   Job Counters
16/02/08 17:24:23 INFO mapred.JobClient:     Launched reduce tasks=1
16/02/08 17:24:23 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=210932
16/02/08 17:24:23 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
16/02/08 17:24:23 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
16/02/08 17:24:23 INFO mapred.JobClient:     Rack-local map tasks=2
16/02/08 17:24:23 INFO mapred.JobClient:     Launched map tasks=10
16/02/08 17:24:23 INFO mapred.JobClient:     Data-local map tasks=8
16/02/08 17:24:23 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=8650
16/02/08 17:24:23 INFO mapred.JobClient:   File Input Format Counters
16/02/08 17:24:23 INFO mapred.JobClient:     Bytes Read=1120
16/02/08 17:24:23 INFO mapred.JobClient:   SkippingTaskCounters
16/02/08 17:24:23 INFO mapred.JobClient:     MapProcessedRecords=10
16/02/08 17:24:23 INFO mapred.JobClient:     ReduceProcessedGroups=5
16/02/08 17:24:23 INFO mapred.JobClient:   File Output Format Counters
16/02/08 17:24:23 INFO mapred.JobClient:     Bytes Written=79
16/02/08 17:24:23 INFO mapred.JobClient:   FileSystemCounters
16/02/08 17:24:23 INFO mapred.JobClient:     FILE_BYTES_READ=871
16/02/08 17:24:23 INFO mapred.JobClient:     HDFS_BYTES_READ=2330
16/02/08 17:24:23 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=272508
16/02/08 17:24:23 INFO mapred.JobClient:     HDFS_BYTES_WRITTEN=10485760079
16/02/08 17:24:23 INFO mapred.JobClient:   Map-Reduce Framework
16/02/08 17:24:23 INFO mapred.JobClient:     Map output materialized bytes=925
16/02/08 17:24:23 INFO mapred.JobClient:     Map input records=10
16/02/08 17:24:23 INFO mapred.JobClient:     Reduce shuffle bytes=925
16/02/08 17:24:23 INFO mapred.JobClient:     Spilled Records=100
16/02/08 17:24:23 INFO mapred.JobClient:     Map output bytes=765
16/02/08 17:24:23 INFO mapred.JobClient:     Total committed heap usage (bytes)=7996702720
16/02/08 17:24:23 INFO mapred.JobClient:     CPU time spent (ms)=104520
16/02/08 17:24:23 INFO mapred.JobClient:     Map input bytes=260
16/02/08 17:24:23 INFO mapred.JobClient:     SPLIT_RAW_BYTES=1210
16/02/08 17:24:23 INFO mapred.JobClient:     Combine input records=0
16/02/08 17:24:23 INFO mapred.JobClient:     Reduce input records=50
16/02/08 17:24:23 INFO mapred.JobClient:     Reduce input groups=5
16/02/08 17:24:23 INFO mapred.JobClient:     Combine output records=0
16/02/08 17:24:23 INFO mapred.JobClient:     Physical memory (bytes) snapshot=7111999488
16/02/08 17:24:23 INFO mapred.JobClient:     Reduce output records=5
16/02/08 17:24:23 INFO mapred.JobClient:     Virtual memory (bytes) snapshot=28466053120
16/02/08 17:24:23 INFO mapred.JobClient:     Map output records=50
java.io.FileNotFoundException: File does not exist: /benchmarks/TestDFSIO/io_write/part-00000
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchLocatedBlocks(DFSClient.java:1975)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1944)
at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1936)
at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:731)
at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:165)
at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
at org.apache.hadoop.fs.TestDFSIO.analyzeResult(TestDFSIO.java:339)
at org.apache.hadoop.fs.TestDFSIO.run(TestDFSIO.java:462)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.hadoop.fs.TestDFSIO.main(TestDFSIO.java:317)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.hadoop.test.AllTestDriver.main(AllTestDriver.java:81)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

crawler@d1r2n2:/hadoop$ bin/hadoop fs -ls /benchmarks/TestDFSIO/io_write
Found 3 items
-rw-r--r--   2 crawler supergroup          0 2016-02-08 17:24 /benchmarks/TestDFSIO/io_write/_SUCCESS
-rw-r--r--   2 crawler supergroup         79 2016-02-08 17:24 /benchmarks/TestDFSIO/io_write/part-00000.deflate
crawler@d1r2n2:/hadoop$

Does anyone have an idea what might be wrong here?

Thanks in advance.

mqureshi · ‎08-04-2016

@SBandaru

Does the user running this job have permission to write to this location? Does the directory /benchmarks/TestDFSIO exist in HDFS?

/benchmarks/TestDFSIO/io_write/part-00000

bandarusridhar1 · ‎08-04-2016

@mqureshi

Yes, the user has all the required permissions. My concern is that instead of writing its output to /benchmarks/TestDFSIO/io_write/part-00000, the job writes it to /benchmarks/TestDFSIO/io_write/part-00000.deflate, which is not what TestDFSIO looks for.

mqureshi · ‎08-04-2016

@SBandaru

Try adding the following option to your run:

-D mapred.output.compress=false
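
For reference, here is a rough sketch of where that option would sit in the command from the original post. It assumes TestDFSIO is launched through ToolRunner (the stack trace above suggests it is), so generic -D options go right after the tool name and before -write. The JAR and CMD variable names are only illustrative, and the script prints the command for review instead of running it.

```shell
#!/bin/sh
# Path to the tests jar, copied from the command in the original post.
JAR=/usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-2.7.1.2.3.2.0-2950-tests.jar

# Generic -D options are placed directly after the tool name (TestDFSIO)
# and before the tool's own flags (-write, -nrFiles, -fileSize).
CMD="yarn jar $JAR TestDFSIO -D mapred.output.compress=false -write -nrFiles 10 -fileSize 100000"

# Print the assembled command so it can be checked before running it.
echo "$CMD"
```

With output compression turned off, the reducer output should no longer carry the .deflate suffix, so the part-00000 lookup in the trace above should find its file. On Hadoop 2.x, mapred.output.compress is the older name for mapreduce.output.fileoutputformat.compress, so either spelling should be accepted.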

bandarusridhar1 · ‎08-04-2016

@mqureshi

Perfect, that is what I had missed.

Cloudera Community

TestDFSIO Output Error