
Write MR job output to existing directory


Hello,

Is it possible to write MR job output to an existing directory without deleting it (incremental write)?

Thanks

Shubham

1 ACCEPTED SOLUTION

It can be done by extending your OutputFormat class (for file output, typically a FileOutputFormat subclass such as TextOutputFormat) and overriding its checkOutputSpecs method so that it does not throw an exception when the output path already exists. After that, register the new class with Job.setOutputFormatClass (new mapreduce API) or JobConf.setOutputFormat (old mapred API) [some more details here].



You can add logic like the following to your MR application so that, if the output directory already exists, it is deleted before the job runs.

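import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

Configuration conf = new Configuration();
Path outputPath = new Path(args[1]);  // args[1] is assumed to hold the job's output path
FileSystem fs = outputPath.getFileSystem(conf);  // resolve the file system that owns the path
if (fs.exists(outputPath)) {
    // The output path already exists: delete it recursively before the job runs.
    fs.delete(outputPath, true);
}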


I do not want to delete the existing directory, as I mentioned above. I want to write more data to the existing directory.

It can be done by extending your OutputFormat class (for file output, typically a FileOutputFormat subclass such as TextOutputFormat) and overriding its checkOutputSpecs method so that it does not throw an exception when the output path already exists. After that, register the new class with Job.setOutputFormatClass (new mapreduce API) or JobConf.setOutputFormat (old mapred API) [some more details here].
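
For anyone landing here later, a minimal sketch of that approach, assuming the new org.apache.hadoop.mapreduce API and plain text output. The class names ExistingDirTextOutputFormat and IncrementalJobSetup are made up for illustration, and the "mapreduce.output.basename" setting is an assumption worth verifying against your Hadoop version: it is meant to give each run its own file prefix so that new part files do not collide with the ones already in the directory.

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

// Hypothetical subclass: same behaviour as TextOutputFormat, except that an
// already existing output directory is accepted instead of rejected.
class ExistingDirTextOutputFormat<K, V> extends TextOutputFormat<K, V> {
    @Override
    public void checkOutputSpecs(JobContext job) throws IOException {
        Path outDir = FileOutputFormat.getOutputPath(job);
        if (outDir == null) {
            throw new IOException("Output directory not set.");
        }
        // Deliberately no "already exists" check here. Note that the stock
        // implementation also obtains delegation tokens for the output path,
        // which this sketch skips; a secure cluster may need that step.
    }
}

// Hypothetical helper showing how a driver could register the class.
class IncrementalJobSetup {
    static void configure(Job job, Path existingOutputDir) throws IOException {
        job.setOutputFormatClass(ExistingDirTextOutputFormat.class);
        FileOutputFormat.setOutputPath(job, existingOutputDir);
        // Assumption: a per-run base name keeps this run's part files from
        // clashing with files written by earlier runs.
        job.getConfiguration().set("mapreduce.output.basename",
                "part-" + System.currentTimeMillis());
    }
}
```

In the driver you would call IncrementalJobSetup.configure(job, new Path(args[1])) before submitting the job. If you use the old mapred API instead, the same idea applies, but checkOutputSpecs takes (FileSystem, JobConf) and the class is registered with JobConf.setOutputFormat.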


Thanks! I will try that.


It worked. Thanks!
