Support Questions

zipping directory in hdfs



Dear Colleagues,


There are a lot of files stored in a directory on HDFS. I want to create a zip archive containing all of these files, so that I can unzip the archive later and have all the files available again.


How can I achieve this? What are your suggestions?


Thanks in advance and best regards,





Re: zipping directory in hdfs

Hi,

I recommend using Hadoop archives (HAR).

Usage: hadoop archive -archiveName name -p <parent> <src>* <dest>

-archiveName is the name of the archive you would like to create, for example foo.har. The name should have a *.har extension. The parent argument specifies the relative path to which the files should be archived. For example:

-p /foo/bar a/b/c e/f/g

Here /foo/bar is the parent path, and a/b/c and e/f/g are paths relative to the parent.

Note that creating the archive runs as a Map/Reduce job, so you need a Map/Reduce cluster to run it. For a detailed example, see the later sections of the Hadoop Archives guide.

If you just want to archive a single directory /foo/bar, you can use:

hadoop archive -archiveName zoo.har -p /foo/bar /outputdir
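
Since the question also asks about getting the files back later, here is a rough sketch of the full round trip for the single-directory case: create the archive, list its contents through the har:// scheme, and copy the files back out. The paths /foo/bar, /outputdir and zoo.har come from the example in this answer; /restored is a hypothetical destination, and the DRY_RUN switch and helper function names are only there for illustration. Keep in mind that a HAR is not a zip file, so it is read back with the Hadoop tools rather than with unzip. By default the script only prints the commands; set DRY_RUN=0 to actually run them on a cluster.

```shell
#!/bin/sh
set -eu

# Paths from the answer's example; RESTORE_DIR is a hypothetical destination.
SRC_DIR="/foo/bar"
OUT_DIR="/outputdir"
ARCHIVE="zoo.har"
RESTORE_DIR="/restored"

# Illustrative switch: print commands by default, execute them when DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

har_roundtrip() {
  # 1. Create the archive (runs as a Map/Reduce job).
  run hadoop archive -archiveName "$ARCHIVE" -p "$SRC_DIR" "$OUT_DIR"

  # 2. List the archived files through the har:// filesystem.
  run hdfs dfs -ls -R "har://${OUT_DIR}/${ARCHIVE}"

  # 3. "Unarchive" by copying the contents back to a regular HDFS directory
  #    (assumes RESTORE_DIR does not exist yet).
  run hdfs dfs -cp "har://${OUT_DIR}/${ARCHIVE}" "$RESTORE_DIR"
}

har_roundtrip
```

For large archives, hadoop distcp with the same har:// source path may be a better fit for step 3, since it copies in parallel.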