<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to define a NIFI processor that will unzip  a file that contains files in a directory tree in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/How-to-define-a-NIFI-processor-that-will-unzip-a-file-that/m-p/218452#M180353</link>
    <description>&lt;P&gt;Come on NIFI gurus.. properly unzipping (without losing the zipped directory structure) should be a simple and easy thing to do in NIFI. &lt;/P&gt;&lt;P&gt;I can't imagine that it's as complex as it seems to be here: &lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/191223/how-to-uncompress-a-zip-file-which-has-a-folder-in.html" target="_blank"&gt;https://community.hortonworks.com/questions/191223/how-to-uncompress-a-zip-file-which-has-a-folder-in.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Please advise.&lt;/P&gt;</description>
    <pubDate>Fri, 12 Oct 2018 17:37:04 GMT</pubDate>
    <dc:creator>dave_sargrad</dc:creator>
    <dc:date>2018-10-12T17:37:04Z</dc:date>
    <item>
      <title>How to define a NIFI processor that will unzip  a file that contains files in a directory tree</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-define-a-NIFI-processor-that-will-unzip-a-file-that/m-p/218451#M180352</link>
      <description>&lt;P&gt;I've used the GetHTTP processor to get a zip file from the internet. I then use PutFile to put this into the file system. I then need to unzip the file and preserve the directory structure that the zip file specifies. Can I do this unzip with a NIFI processor? Once unzipped, I will then need to do additional NIFI processing on specific files within the original zip file. I tried to use UnpackContent; however, its output was a set of FlowFiles that lost the directory structure.&lt;/P&gt;&lt;P&gt;Would I need a custom script for this (e.g. use the ExecuteScript processor)? Or perhaps I should integrate "Storm" with NIFI to facilitate such an unzip.. that seems overly complex, and I don't even know that it's a proper task for a Storm process. &lt;/P&gt;&lt;P&gt;Please advise. I'd think a simple unzip file action is.. well, simple.&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 13:48:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-define-a-NIFI-processor-that-will-unzip-a-file-that/m-p/218451#M180352</guid>
      <dc:creator>dave_sargrad</dc:creator>
      <dc:date>2022-09-16T13:48:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to define a NIFI processor that will unzip  a file that contains files in a directory tree</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-define-a-NIFI-processor-that-will-unzip-a-file-that/m-p/218452#M180353</link>
      <description>&lt;P&gt;Come on NIFI gurus.. properly unzipping (without losing the zipped directory structure) should be a simple and easy thing to do in NIFI. &lt;/P&gt;&lt;P&gt;I can't imagine that it's as complex as it seems to be here: &lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/questions/191223/how-to-uncompress-a-zip-file-which-has-a-folder-in.html" target="_blank"&gt;https://community.hortonworks.com/questions/191223/how-to-uncompress-a-zip-file-which-has-a-folder-in.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;Please advise.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Oct 2018 17:37:04 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-define-a-NIFI-processor-that-will-unzip-a-file-that/m-p/218452#M180353</guid>
      <dc:creator>dave_sargrad</dc:creator>
      <dc:date>2018-10-12T17:37:04Z</dc:date>
    </item>
    <item>
      <title>Re: How to define a NIFI processor that will unzip  a file that contains files in a directory tree</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-define-a-NIFI-processor-that-will-unzip-a-file-that/m-p/218453#M180354</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/98166/davesargrad.html" nodeid="98166" target="_blank"&gt;@David Sargrad&lt;/A&gt;&lt;P&gt;-&lt;/P&gt;&lt;P&gt;The link example you provided in your comment is trying to deal with a zip that contains zipped files (a zip of zips).&lt;/P&gt;&lt;P&gt;If you are talking about a single zip that contains a directory tree with subfiles, this is relatively easy to do.&lt;/P&gt;&lt;P&gt;-&lt;/P&gt;&lt;P&gt;After ingesting your zip file via GetHTTP, feed it to an "UnpackContent" processor and then to a "PutFile" processor.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="92820-screen-shot-2018-10-12-at-84631-am.png" style="width: 1203px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/16073i43A57DC19CB5B967/image-size/medium?v=v2&amp;amp;px=400" role="button" title="92820-screen-shot-2018-10-12-at-84631-am.png" alt="92820-screen-shot-2018-10-12-at-84631-am.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;-&lt;/P&gt;&lt;P&gt;When the "UnpackContent" processor unzips the source file, it will create a new FlowFile for each unique file found.  A variety of FlowFile attributes will be set on each of those generated FlowFiles.  This includes the "path" attribute.&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="92818-screen-shot-2018-10-12-at-83428-am.png" style="width: 314px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/16074i10A3A3B006F048A5/image-size/medium?v=v2&amp;amp;px=400" role="button" title="92818-screen-shot-2018-10-12-at-83428-am.png" alt="92818-screen-shot-2018-10-12-at-83428-am.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;In the above example I created a directory named "zip-root" and created 4 sub-directories within that zip-root directory.  I then created one file in each of those sub-directories.  
I then zipped up the zip-root directory (zip -r zip-root.zip zip-root) into a file named zip-root.zip. The above screenshot shows just one of those unpacked files.&lt;/P&gt;&lt;P&gt;-&lt;/P&gt;&lt;P&gt;After "UnpackContent" executed, it produced 4 new FlowFiles (one for each file found in those sub-directories within the zip).&lt;/P&gt;&lt;P&gt;-&lt;/P&gt;&lt;P&gt;The "path" FlowFile attribute on each of these generated FlowFiles can be used to maintain the original directory structure when writing out the FlowFiles via "PutFile" as follows:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="92819-screen-shot-2018-10-12-at-84346-am.png" style="width: 498px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/16075i1F0C7243F379780A/image-size/medium?v=v2&amp;amp;px=400" role="button" title="92819-screen-shot-2018-10-12-at-84346-am.png" alt="92819-screen-shot-2018-10-12-at-84346-am.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;You can see from the above configuration that as each FlowFile is processed by the PutFile processor, it will be placed in a directory based on the value assigned to the "path" attribute set on each incoming FlowFile.  Here I decided that my target base directory should be /tmp/target/, and the original zipped files' directory structure is then preserved/generated beneath it.&lt;/P&gt;&lt;P&gt;-&lt;/P&gt;&lt;P&gt;Thank you,&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;&lt;P&gt;-&lt;/P&gt;&lt;P&gt;If you found this answer addressed your question, please take a moment to log in and click the "ACCEPT" link.&lt;/P&gt;</description>
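      <!--
The idea in the answer above (one output per zipped file, each carrying a "path" attribute that is appended to a base directory on write) can be sketched outside NiFi in plain Python. This is only an illustration under stated assumptions, not NiFi code: the function names unpack_entries, put_file and build_sample_zip are hypothetical, the dir1/file1.txt style names are made up for the demo, and the exact format of the "path" value that UnpackContent sets (trailing slash, "./" for top-level files) is assumed rather than taken from the thread. In NiFi itself the equivalent would presumably be a PutFile Directory value along the lines of /tmp/target/${path}, as suggested by the screenshot description.

```python
import io
import os
import posixpath
import tempfile
import zipfile


def unpack_entries(zip_bytes):
    """Yield (attributes, content) for each file entry in the archive.

    Loosely mirrors "one FlowFile per zipped file"; the attribute names
    "path" and "filename" follow the thread, their exact format is assumed.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no content of their own
            directory, filename = posixpath.split(info.filename)
            path = directory + "/" if directory else "./"
            yield {"path": path, "filename": filename}, archive.read(info)


def put_file(base_dir, attributes, content):
    """Write content to base_dir/path/filename, creating missing directories."""
    base = os.path.abspath(base_dir)
    target_dir = os.path.abspath(os.path.join(base, attributes["path"]))
    # Refuse entries whose path would escape the base directory.
    if os.path.commonpath([base, target_dir]) != base:
        raise ValueError("unsafe path: " + attributes["path"])
    os.makedirs(target_dir, exist_ok=True)
    target = os.path.join(target_dir, attributes["filename"])
    with open(target, "wb") as handle:
        handle.write(content)
    return target


def build_sample_zip():
    """Build an in-memory zip with 4 sub-directories holding one file each."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for n in range(1, 5):
            name = "zip-root/dir{0}/file{0}.txt".format(n)  # hypothetical names
            archive.writestr(name, "content {0}".format(n))
    return buffer.getvalue()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        for attributes, content in unpack_entries(build_sample_zip()):
            print(put_file(base, attributes, content))
```

Keeping the directory in a separate "path" value, instead of flattening everything to a file name, is what lets the writer rebuild the tree under any base directory it chooses.
      -->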
      <pubDate>Sun, 18 Aug 2019 02:50:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-define-a-NIFI-processor-that-will-unzip-a-file-that/m-p/218453#M180354</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2019-08-18T02:50:14Z</dc:date>
    </item>
    <item>
      <title>Re: How to define a NIFI processor that will unzip  a file that contains files in a directory tree</title>
      <link>https://community.cloudera.com/t5/Support-Questions/How-to-define-a-NIFI-processor-that-will-unzip-a-file-that/m-p/218454#M180355</link>
      <description>&lt;P&gt;Thank you. I like your answer very much. I do think the referenced example was not focused on a zip of zips (just a simple zip of a directory tree). Yet I think your answer is proper. The "path" attribute does the job. I'll try this, and thanks.&lt;/P&gt;</description>
      <pubDate>Fri, 12 Oct 2018 20:03:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/How-to-define-a-NIFI-processor-that-will-unzip-a-file-that/m-p/218454#M180355</guid>
      <dc:creator>dave_sargrad</dc:creator>
      <dc:date>2018-10-12T20:03:00Z</dc:date>
    </item>
  </channel>
</rss>

