Created 04-06-2016 06:27 AM
Hello,
Is it possible to write MR job output to an existing directory without deleting it (incremental write)?
Thanks
Shubham
Created 04-06-2016 06:49 AM
You can add logic like the following to your MR application so that, if the output directory already exists, it is deleted first.
Configuration conf = new Configuration();
FileSystem fs = FileSystem.get(conf);
Path outputPath = new Path(args[1]);
// If the output path already exists, delete it recursively
if (fs.exists(outputPath)) {
    fs.delete(outputPath, true);
}
Created 04-06-2016 07:07 AM
As I mentioned above, I do not want to delete the existing directory. I want to write more data to it.
Created 04-06-2016 07:33 AM
It can be done by extending an OutputFormat class and overriding its checkOutputSpecs method so that it doesn't throw an exception when the output path already exists. After that, register the new class with JobConf.setOutputFormat (old mapred API) or Job.setOutputFormatClass (new mapreduce API) [some more details here].
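A minimal sketch of that approach, assuming the new org.apache.hadoop.mapreduce API and text output. The class name AppendingTextOutputFormat is made up for illustration and is not from the linked details. The override keeps the "output directory must be set" check and the token lookup that FileOutputFormat normally performs, and leaves out only the "already exists" check.

```java
import java.io.IOException;

import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.InvalidJobConfException;
import org.apache.hadoop.mapreduce.JobContext;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.mapreduce.security.TokenCache;

/**
 * Hypothetical output format that accepts an output directory
 * which already exists, so a job can add files to it.
 */
public class AppendingTextOutputFormat<K, V> extends TextOutputFormat<K, V> {

    @Override
    public void checkOutputSpecs(JobContext job) throws IOException {
        // Still require that an output directory has been configured.
        Path outDir = getOutputPath(job);
        if (outDir == null) {
            throw new InvalidJobConfException("Output directory not set.");
        }

        // Obtain delegation tokens for the output file system
        // (returns immediately when security is disabled).
        TokenCache.obtainTokensForNamenodes(
                job.getCredentials(), new Path[] { outDir }, job.getConfiguration());

        // Deliberately no "output directory already exists" check here:
        // that check is what makes the default implementation fail.
    }
}
```

In the driver this would be registered with job.setOutputFormatClass(AppendingTextOutputFormat.class). One caveat, stated as an assumption about typical setups: each run names its files part-r-00000, part-r-00001 and so on by default, so a second run may collide with or replace files from the first. Giving each run a distinct base name, for example by setting the mapreduce.output.basename property to something run-specific, is one way to avoid that.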
Created 04-06-2016 08:42 AM
Thanks! I will try that.
Created 04-11-2016 05:24 PM
It worked. Thanks!