Write MR job output to existing directory

New Member

Hello,

Is it possible to write MR job output to an existing directory without deleting it (an incremental write)?

Thanks

Shubham

1 ACCEPTED SOLUTION

Master Guru

This can be done by extending an OutputFormat class and overriding its checkOutputSpecs method so that it does not throw an exception when the output path already exists. After that, register the new class with Job.setOutputFormatClass (or JobConf.setOutputFormat in the older mapred API) [some more details here].
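As a rough illustration of the approach described above, here is a minimal sketch using the newer org.apache.hadoop.mapreduce API. The class name AppendableTextOutputFormat is hypothetical, and basing it on TextOutputFormat is an assumption; any FileOutputFormat subclass should work the same way. It keeps the parent's "output directory must be set" check and delegation-token step, and only drops the "already exists" check.

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.security.TokenCache;

// Hypothetical class name: a text output format that tolerates an
// output directory that already exists.
public class AppendableTextOutputFormat<K, V> extends TextOutputFormat<K, V> {

    @Override
    public void checkOutputSpecs(JobContext job) throws IOException {
        // Still require that an output directory has been configured.
        Path outDir = FileOutputFormat.getOutputPath(job);
        if (outDir == null) {
            throw new IOException("Output directory not set.");
        }

        // Obtain delegation tokens for the output file system, as the
        // default implementation does (relevant on secured clusters).
        TokenCache.obtainTokensForNamenodes(
                job.getCredentials(), new Path[] {outDir}, job.getConfiguration());

        // Deliberately no exists() check here: the default implementation
        // would throw FileAlreadyExistsException at this point.
    }
}
```

In the driver it would be registered with job.setOutputFormatClass(AppendableTextOutputFormat.class). One caveat to verify on your own cluster: by default each run names its files part-r-00000 and so on, so a later run may replace files from an earlier one. Setting the mapreduce.output.basename property to a different value per run is one way to keep the names distinct.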

5 REPLIES

Not applicable

You can add logic like the following to your MR application so that, if the output directory already exists, it is deleted first.

Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Path outputPath = new Path(args[1]);
// If the output path already exists, delete it recursively
if (fs.exists(outputPath)) {
    fs.delete(outputPath, true);
}

New Member

As I mentioned above, I do not want to delete the existing directory. I want to write more data to it.

Master Guru

This can be done by extending an OutputFormat class and overriding its checkOutputSpecs method so that it does not throw an exception when the output path already exists. After that, register the new class with Job.setOutputFormatClass (or JobConf.setOutputFormat in the older mapred API) [some more details here].

New Member

Thanks! I will try that.

New Member

It worked. Thanks!