
Compression for an example MapReduce job

Dear community,

If I want to run the teragen program with output compression, is this the correct command?

sudo -u hdfs hadoop jar hadoop-mapreduce-examples-2.6.0-cdh5.13.0.jar teragen -D mapreduce.output.fileoutputformat.compress=true 1000 /user/dev/teragen


Is my understanding correct? The first option enables compression of the intermediate output, the second option specifies that zip compression should be used, and the third option ensures that the final output is zipped as well. I have seen multiple variants of this command, some of them using deprecated options. The output of my teragen run is still not zipped, so something is still not correct.
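
To make the three options described above concrete, here is a rough sketch of what I think the full command would look like. It only prints the command and does not run it. The two map-output property names and the gzip codec class are my assumptions (the Hadoop 2.x names as I understand them, and my reading of "zipped"); only the output compression option, the jar name, the row count and the output path come from the command I posted. The function name build_teragen_cmd is made up for illustration.

```shell
#!/bin/sh
# Sketch only: assembles and prints the teragen command, it does not run Hadoop.

JAR=hadoop-mapreduce-examples-2.6.0-cdh5.13.0.jar
# Assumption: "zipped" means the gzip codec.
CODEC=org.apache.hadoop.io.compress.GzipCodec

# Hypothetical helper that builds the command line as a single string.
build_teragen_cmd() {
  echo "sudo -u hdfs hadoop jar $JAR teragen" \
    "-D mapreduce.map.output.compress=true" \
    "-D mapreduce.map.output.compress.codec=$CODEC" \
    "-D mapreduce.output.fileoutputformat.compress=true" \
    "1000 /user/dev/teragen"
}

build_teragen_cmd
```

The first -D is meant to be the intermediate compression, the second the codec, and the third the output compression. I have not verified whether teragen's output format actually honors the output compression setting, which might be why my output is still not zipped.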