Distcp compression not working
Labels: Apache Hadoop
Created 10-05-2016 05:02 AM
Hi, I am trying to compress the output of distcp, but the output is not compressed. Please help me compress it. Below is the command I am using:
hadoop distcp -D mapreduce.output.fileoutputformat.compress=true -D mapreduce.output.fileoutputformat.compress.codec=org.apache.hadoop.io.compress.BZip2Codec inputdir outputdir
Created 10-05-2016 07:59 PM
This feature is not yet available in Hadoop by default: distcp does not compress data, although you can apply a patch to add it. The following JIRA gives all the details, including the patch to download.
https://issues.apache.org/jira/browse/HADOOP-8065
The following is the newer JIRA:
https://issues.apache.org/jira/browse/HADOOP-13114 (use this one if you decide to apply the patch)
Created 10-05-2016 01:30 PM
@Arun Reddy Which version of HDP are you using?
Created 10-05-2016 06:46 PM
The plain Apache release, Hadoop 2.7.1.
Created 10-06-2016 06:07 AM
Thank you @mqureshi. How about HDP 2.4? Is that patch included in HDP 2.4 and above?
Created 10-06-2016 04:21 PM
Negative. If you check the JIRAs, they are unresolved, and we don't ship unresolved issues in our product. So your only option right now is to download the patch and apply it to your installation. If you have a support contract, that will affect it, because you would be applying a non-Hortonworks patch.
I would suggest that you simply distcp the file and then compress it. Compressing during the copy would only save you a step; it would not save you any time or give better performance.
Created 10-07-2016 06:20 AM
Thanks for your time. I am copying the directory to the local filesystem and applying compression there.
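For anyone taking the same copy-then-compress route, the local compression step might look roughly like the sketch below. It is only an illustration, not part of distcp or Hadoop: the function name `compress_tree` and the `remove_original` flag are made up for this example, and it assumes the directory has already been copied to local disk (for instance with `hadoop fs -get`). It uses bzip2 to match the codec in the original command.

```python
import bz2
import shutil
from pathlib import Path
from typing import List


def compress_tree(root: str, remove_original: bool = True) -> List[Path]:
    """bzip2-compress every regular file under root (hypothetical helper).

    Each file `name` is written as `name.bz2` next to it. Files that already
    end in .bz2 are skipped, so running the function twice is harmless.
    """
    written = []
    # sorted() materializes the listing first, so files created or deleted
    # during the loop do not disturb the directory walk.
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix == ".bz2":
            continue
        target = path.with_name(path.name + ".bz2")
        # Stream the data so large files are not loaded into memory at once.
        with path.open("rb") as src, bz2.open(target, "wb") as dst:
            shutil.copyfileobj(src, dst)
        if remove_original:
            path.unlink()
        written.append(target)
    return written
```

Calling `compress_tree("localdir")` on the copied directory leaves one `.bz2` file per original file, which can then be uploaded again with `hadoop fs -put` if the compressed data needs to live in HDFS.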
