Created 08-04-2016 01:38 AM
Hi everyone,
I am getting this error when I run TestDFSIO. The job itself finishes successfully (at least according to the JobTracker), but this is what I get on the console:
crawler@d1r2n2:/hadoop$ yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-2.7.1.2.3.2.0-2950-tests.jar TestDFSIO -write -nrFiles 10 -fileSize 100000
16/02/08 17:23:43 INFO fs.TestDFSIO: nrFiles = 10
16/02/08 17:23:43 INFO fs.TestDFSIO: fileSize (MB) = 1000
16/02/08 17:23:43 INFO fs.TestDFSIO: bufferSize = 1000000
16/02/08 17:23:43 INFO fs.TestDFSIO: creating control file: 1000 mega bytes, 10 files
16/02/08 17:23:44 INFO fs.TestDFSIO: created control files for: 10 files
16/02/08 17:23:44 INFO mapred.FileInputFormat: Total input paths to process : 10
16/02/08 17:23:44 INFO mapred.JobClient: Running job: job_201304191712_0002
16/02/08 17:23:45 INFO mapred.JobClient: map 0% reduce 0%
16/02/08 17:24:06 INFO mapred.JobClient: map 20% reduce 0%
16/02/08 17:24:07 INFO mapred.JobClient: map 30% reduce 0%
16/02/08 17:24:09 INFO mapred.JobClient: map 50% reduce 0%
16/02/08 17:24:11 INFO mapred.JobClient: map 60% reduce 0%
16/02/08 17:24:12 INFO mapred.JobClient: map 90% reduce 0%
16/02/08 17:24:13 INFO mapred.JobClient: map 100% reduce 0%
16/02/08 17:24:21 INFO mapred.JobClient: map 100% reduce 33%
16/02/08 17:24:22 INFO mapred.JobClient: map 100% reduce 100%
16/02/08 17:24:23 INFO mapred.JobClient: Job complete: job_201304191712_0002
16/02/08 17:24:23 INFO mapred.JobClient: Counters: 33
16/02/08 17:24:23 INFO mapred.JobClient: Job Counters
16/02/08 17:24:23 INFO mapred.JobClient: Launched reduce tasks=1
16/02/08 17:24:23 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=210932
16/02/08 17:24:23 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
16/02/08 17:24:23 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
16/02/08 17:24:23 INFO mapred.JobClient: Rack-local map tasks=2
16/02/08 17:24:23 INFO mapred.JobClient: Launched map tasks=10
16/02/08 17:24:23 INFO mapred.JobClient: Data-local map tasks=8
16/02/08 17:24:23 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=8650
16/02/08 17:24:23 INFO mapred.JobClient: File Input Format Counters
16/02/08 17:24:23 INFO mapred.JobClient: Bytes Read=1120
16/02/08 17:24:23 INFO mapred.JobClient: SkippingTaskCounters
16/02/08 17:24:23 INFO mapred.JobClient: MapProcessedRecords=10
16/02/08 17:24:23 INFO mapred.JobClient: ReduceProcessedGroups=5
16/02/08 17:24:23 INFO mapred.JobClient: File Output Format Counters
16/02/08 17:24:23 INFO mapred.JobClient: Bytes Written=79
16/02/08 17:24:23 INFO mapred.JobClient: FileSystemCounters
16/02/08 17:24:23 INFO mapred.JobClient: FILE_BYTES_READ=871
16/02/08 17:24:23 INFO mapred.JobClient: HDFS_BYTES_READ=2330
16/02/08 17:24:23 INFO mapred.JobClient: FILE_BYTES_WRITTEN=272508
16/02/08 17:24:23 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=10485760079
16/02/08 17:24:23 INFO mapred.JobClient: Map-Reduce Framework
16/02/08 17:24:23 INFO mapred.JobClient: Map output materialized bytes=925
16/02/08 17:24:23 INFO mapred.JobClient: Map input records=10
16/02/08 17:24:23 INFO mapred.JobClient: Reduce shuffle bytes=925
16/02/08 17:24:23 INFO mapred.JobClient: Spilled Records=100
16/02/08 17:24:23 INFO mapred.JobClient: Map output bytes=765
16/02/08 17:24:23 INFO mapred.JobClient: Total committed heap usage (bytes)=7996702720
16/02/08 17:24:23 INFO mapred.JobClient: CPU time spent (ms)=104520
16/02/08 17:24:23 INFO mapred.JobClient: Map input bytes=260
16/02/08 17:24:23 INFO mapred.JobClient: SPLIT_RAW_BYTES=1210
16/02/08 17:24:23 INFO mapred.JobClient: Combine input records=0
16/02/08 17:24:23 INFO mapred.JobClient: Reduce input records=50
16/02/08 17:24:23 INFO mapred.JobClient: Reduce input groups=5
16/02/08 17:24:23 INFO mapred.JobClient: Combine output records=0
16/02/08 17:24:23 INFO mapred.JobClient: Physical memory (bytes) snapshot=7111999488
16/02/08 17:24:23 INFO mapred.JobClient: Reduce output records=5
16/02/08 17:24:23 INFO mapred.JobClient: Virtual memory (bytes) snapshot=28466053120
16/02/08 17:24:23 INFO mapred.JobClient: Map output records=50
java.io.FileNotFoundException: File does not exist: /benchmarks/TestDFSIO/io_write/part-00000
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.fetchLocatedBlocks(DFSClient.java:1975)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.openInfo(DFSClient.java:1944)
    at org.apache.hadoop.hdfs.DFSClient$DFSInputStream.<init>(DFSClient.java:1936)
    at org.apache.hadoop.hdfs.DFSClient.open(DFSClient.java:731)
    at org.apache.hadoop.hdfs.DistributedFileSystem.open(DistributedFileSystem.java:165)
    at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:427)
    at org.apache.hadoop.fs.TestDFSIO.analyzeResult(TestDFSIO.java:339)
    at org.apache.hadoop.fs.TestDFSIO.run(TestDFSIO.java:462)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.fs.TestDFSIO.main(TestDFSIO.java:317)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.hadoop.test.AllTestDriver.main(AllTestDriver.java:81)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
crawler@d1r2n2:/hadoop$ bin/hadoop fs -ls /benchmarks/TestDFSIO/io_write
Found 3 items
-rw-r--r--   2 crawler supergroup          0 2016-02-08 17:24 /benchmarks/TestDFSIO/io_write/_SUCCESS
-rw-r--r--   2 crawler supergroup         79 2016-02-08 17:24 /benchmarks/TestDFSIO/io_write/part-00000.deflate
crawler@d1r2n2:/hadoop$
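In case it helps, the compressed part file itself does not look broken; its contents can be inspected with hadoop fs -text, which decompresses files by their codec extension (such as .deflate), using the path from the listing above:

    bin/hadoop fs -text /benchmarks/TestDFSIO/io_write/part-00000.deflate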
Does anyone have an idea what might be wrong here?
Thanks in advance.
Created 08-04-2016 01:52 AM
Does the user who is running this job have permission to write to this location? Does the directory /benchmarks/TestDFSIO exist in HDFS?
/benchmarks/TestDFSIO/io_write/part-00000
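Both can be checked from the shell, for example (paths taken from the error above; the listing also shows the owner and permissions of the directory):

    hdfs dfs -ls /benchmarks
    hdfs dfs -ls /benchmarks/TestDFSIO/io_write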
Created 08-04-2016 02:22 AM
Yes, the user has all the required permissions. My concern is that instead of getting the output file /benchmarks/TestDFSIO/io_write/part-00000, I'm getting /benchmarks/TestDFSIO/io_write/part-00000.deflate, which is not what TestDFSIO expects.
Created 08-04-2016 02:35 AM
The .deflate extension means MapReduce output compression is enabled, so the reducer writes part-00000.deflate while TestDFSIO's analyzeResult looks for the uncompressed part-00000. Disable output compression for this job (mapreduce.output.fileoutputformat.compress=false, or mapred.output.compress=false on older releases) and re-run the test.
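For a one-off run the property can be passed on the command line, a sketch reusing the jar path and arguments from your command above (TestDFSIO goes through ToolRunner, so the generic -D option must come before the tool's own flags):

    yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-client-jobclient-2.7.1.2.3.2.0-2950-tests.jar TestDFSIO \
        -D mapreduce.output.fileoutputformat.compress=false \
        -write -nrFiles 10 -fileSize 100000

Alternatively, setting mapreduce.output.fileoutputformat.compress to false in mapred-site.xml turns output compression off for all jobs.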
Created 08-04-2016 02:45 AM
Perfect, that is what I had missed.