Distcp compression not working

Hi, I am trying to compress the output of distcp, but the output is not compressed. Please help me compress it. Below is the command I am using:

hadoop distcp -D mapreduce.output.fileoutputformat.compress=true -D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.BZip2Codec inputdir outputdir

1 ACCEPTED SOLUTION

Super Guru
@Arun Reddy

This feature is still not available in Hadoop by default: distcp does not compress data, although you can apply a patch that adds it. The following JIRA gives all the details, including the patch to download.

https://issues.apache.org/jira/browse/HADOOP-8065

The following is the newer JIRA; use this one if you decide to apply the patch:

https://issues.apache.org/jira/browse/HADOOP-13114

Guru

@Arun Reddy Which version of HDP are you using?

Plain Apache Hadoop, version 2.7.1.

Thank you, @mqureshi. How about HDP 2.4? Is that patch included in HDP 2.4 and above?

Super Guru

Negative. If you check the JIRAs, they are unresolved, and we don't ship unresolved issues in our product. So your only option right now is to download the patch and apply it to your installation. If you have a support contract, that will affect it, because you would be applying a non-Hortonworks patch.

I would suggest that you simply distcp the files and then compress them. Compressing during the copy would only save you a step; it would not save any time or give better performance.

Thanks for your time. I am copying the directory to local disk and applying compression there.
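
For anyone landing on this thread later, here is a minimal sketch of the copy-then-compress workaround discussed above. It is only an illustration, not a tested procedure for any particular cluster: the function names `compress_tree` and `copy_then_compress` are made up for this example, the paths are whatever you pass in, and it assumes `hadoop` and `bzip2` are on the PATH and that the data fits on local disk.

```shell
#!/bin/sh
# Sketch only: function names and arguments are hypothetical.

# Compress every regular file under a local directory in place.
# $1 = local directory, $2 = compressor command (defaults to bzip2).
compress_tree() {
    dir=$1
    tool=${2:-bzip2}
    find "$dir" -type f -exec "$tool" {} +
}

# Copy with distcp (no compression), pull the result to local disk,
# then compress the local copy.
# $1 = source path, $2 = destination path, $3 = local working directory.
copy_then_compress() {
    hadoop distcp "$1" "$2" &&
    hadoop fs -get "$2" "$3" &&
    compress_tree "$3"
}
```

A call would look something like `copy_then_compress inputdir outputdir /tmp/outputdir`, reusing the directory names from the original command. The compressor is a parameter so that another tool such as `gzip` can be swapped in if bzip2 is not wanted.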