Write MR job output to existing directory

Contributor

Hello,

Is it possible to write MR job output to an existing directory without deleting it (an incremental write)?

Thanks

Shubham

1 ACCEPTED SOLUTION

Master Guru

This can be done by extending the OutputFormat class and overriding its checkOutputSpecs method so that it does not throw an exception when the output path already exists. Then register the new class with Job.setOutputFormatClass (or JobConf.setOutputFormat in the older mapred API) [some more details here].
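A minimal sketch of that approach, assuming the newer org.apache.hadoop.mapreduce API and plain text output. The class name ExistingDirTextOutputFormat is hypothetical, chosen only for illustration. It keeps the "output directory must be set" check and the token lookup that the stock FileOutputFormat performs, and leaves out only the check that rejects an existing directory:

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.security.TokenCache;

// Hypothetical class name; a sketch, not a drop-in replacement for every job.
public class ExistingDirTextOutputFormat<K, V> extends TextOutputFormat<K, V> {

    @Override
    public void checkOutputSpecs(JobContext job) throws IOException {
        // Still require that an output directory has been configured.
        Path outDir = getOutputPath(job);
        if (outDir == null) {
            throw new IOException("Output directory not set.");
        }

        // Obtain delegation tokens for the output file system (relevant on secure clusters).
        TokenCache.obtainTokensForNamenodes(
                job.getCredentials(), new Path[] { outDir }, job.getConfiguration());

        // Deliberately no "already exists" check here, so the job is allowed
        // to start even when outDir is already present.
    }
}
```

In the driver this would be registered with job.setOutputFormatClass(ExistingDirTextOutputFormat.class). One thing to check before relying on it: a second job writes files with the same default names (part-r-00000 and so on), so existing files with those names may be replaced or may conflict at commit time. Using a different base name per run (for example through the mapreduce.output.basename property) is one possible way around that, but it should be verified against your Hadoop version.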


5 REPLIES


You can add logic like the following to your MR application so that the output directory is deleted first if it already exists.

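import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

Configuration conf = new Configuration();
Path outputPath = new Path(args[1]);
// Resolve the file system from the path itself, so non-default schemes also work
FileSystem fs = outputPath.getFileSystem(conf);
if (fs.exists(outputPath)) {
    // The output path exists: delete it recursively
    fs.delete(outputPath, true);
}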

Contributor

As I mentioned above, I do not want to delete the existing directory. I want to write more data to it.

Master Guru

This can be done by extending the OutputFormat class and overriding its checkOutputSpecs method so that it does not throw an exception when the output path already exists. Then register the new class with Job.setOutputFormatClass (or JobConf.setOutputFormat in the older mapred API) [some more details here].

Contributor

Thanks! I will try that.

Contributor

It worked. Thanks!