<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Apache Nifi - Using PutParquet, the HDFS file format transferred remains native (.txt) in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Apache-Nifi-Using-PutParquet-the-HDFS-file-format/m-p/309257#M223802</link>
    <description>&lt;P&gt;&lt;SPAN&gt;I used the PutParquet processor with a CSVReader to convert .txt files to Parquet format and move them to HDFS.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;However, when I check the Browse Directory view in Hadoop, the files still have the &lt;STRONG&gt;.txt&lt;/STRONG&gt; extension.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Is this normal? Are they actually saved in Parquet format?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Thank you!&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/73798"&gt;@ApacheNifi&lt;/a&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Configuration of the PutParquet processor:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Schermata 2021-01-10 alle 19.22.35.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/30084i861DD97654D910D3/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Schermata 2021-01-10 alle 19.22.35.png" alt="Schermata 2021-01-10 alle 19.22.35.png" /&gt;&lt;/span&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Schermata 2021-01-10 alle 19.22.29.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/30083iD0E44F1B359F59AC/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Schermata 2021-01-10 alle 19.22.29.png" alt="Schermata 2021-01-10 alle 19.22.29.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Hadoop:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Schermata 2021-01-10 alle 19.21.29.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/30082i1A3ABE00C0269AC5/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Schermata 2021-01-10 alle 19.21.29.png" alt="Schermata 2021-01-10 alle 19.21.29.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
    <pubDate>Sun, 10 Jan 2021 18:30:22 GMT</pubDate>
    <dc:creator>Lallagreta</dc:creator>
    <dc:date>2021-01-10T18:30:22Z</dc:date>
    <item>
      <title>Apache Nifi - Using PutParquet, the HDFS file format transferred remains native (.txt)</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Apache-Nifi-Using-PutParquet-the-HDFS-file-format/m-p/309257#M223802</link>
      <description>&lt;P&gt;&lt;SPAN&gt;I used the PutParquet processor with a CSVReader to convert .txt files to Parquet format and move them to HDFS.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;However, when I check the Browse Directory view in Hadoop, the files still have the &lt;STRONG&gt;.txt&lt;/STRONG&gt; extension.&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Is this normal? Are they actually saved in Parquet format?&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Thank you!&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/73798"&gt;@ApacheNifi&lt;/a&gt;&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;Configuration of the PutParquet processor:&lt;/SPAN&gt;&lt;/P&gt;&lt;P&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Schermata 2021-01-10 alle 19.22.35.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/30084i861DD97654D910D3/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Schermata 2021-01-10 alle 19.22.35.png" alt="Schermata 2021-01-10 alle 19.22.35.png" /&gt;&lt;/span&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Schermata 2021-01-10 alle 19.22.29.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/30083iD0E44F1B359F59AC/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Schermata 2021-01-10 alle 19.22.29.png" alt="Schermata 2021-01-10 alle 19.22.29.png" /&gt;&lt;/span&gt;&lt;/P&gt;&lt;P&gt;Hadoop:&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="Schermata 2021-01-10 alle 19.21.29.png" style="width: 400px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/30082i1A3ABE00C0269AC5/image-size/medium?v=v2&amp;amp;px=400" role="button" title="Schermata 2021-01-10 alle 19.21.29.png" alt="Schermata 2021-01-10 alle 19.21.29.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 10 Jan 2021 18:30:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Apache-Nifi-Using-PutParquet-the-HDFS-file-format/m-p/309257#M223802</guid>
      <dc:creator>Lallagreta</dc:creator>
      <dc:date>2021-01-10T18:30:22Z</dc:date>
    </item>
    <item>
      <title>Re: Apache Nifi - Using PutParquet, the HDFS file format transferred remains native (.txt)</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Apache-Nifi-Using-PutParquet-the-HDFS-file-format/m-p/309349#M223813</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/84355"&gt;@Lallagreta&lt;/a&gt; You should be able to define the filename, or change it to whatever you want. That said, the filename doesn't dictate the file type, so you can have Parquet data saved with a .txt extension.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;One recommendation I have is to use the Parquet command-line tools while testing your use case. This is the best way to validate that the files look right and have the right schema and the right results.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;A href="https://pypi.org/project/parquet-tools/" target="_blank"&gt;https://pypi.org/project/parquet-tools/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I apologize that I don't have exact samples, but from what I recall from a year ago, there is a simple command to check the schema of a file and another to show the data. You may have to copy your HDFS files to the local file system to inspect them from the command line.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If this answer resolves your issue or allows you to move forward, please choose to ACCEPT this solution and close this topic. If you have further dialogue on this topic, please comment here or feel free to private message me. If you have new questions related to your use case, please create a separate topic and feel free to tag me in your post.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Steven&lt;/P&gt;</description>
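As a rough illustration of the point in the reply above that the filename does not dictate the type: once a file has been copied from HDFS to the local file system, one quick sanity check is to look for the 4-byte marker "PAR1" that a standard Parquet file carries at both its start and its end. The sketch below is only that, a sketch. The function name, the helper, and the sample file names are hypothetical, it does not parse the schema or the data (the command-line tools mentioned in the reply are the better fit for that), and it only checks the plain "PAR1" marker.

```python
import os
import tempfile

PARQUET_MAGIC = b"PAR1"  # 4-byte marker at the start and end of a Parquet file


def looks_like_parquet(path):
    """Return True if the file begins and ends with the Parquet magic bytes."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        head = f.read(4)
        f.seek(max(size - 4, 0))
        tail = f.read(4)
    # Leading magic + 4-byte footer length + trailing magic need 12 bytes.
    return size >= 12 and head == PARQUET_MAGIC and tail == PARQUET_MAGIC


def demo():
    """Build two stand-in files (hypothetical names) and check both."""
    with tempfile.TemporaryDirectory() as tmp:
        fake_parquet = os.path.join(tmp, "part-0001.txt")
        plain_text = os.path.join(tmp, "notes.txt")
        # Not a real Parquet file: only the markers are present, for the demo.
        with open(fake_parquet, "wb") as f:
            f.write(PARQUET_MAGIC + b"\x00" * 8 + PARQUET_MAGIC)
        # Ordinary CSV text, as the input files in the question would be.
        with open(plain_text, "wb") as f:
            f.write(b"id,name\n1,alice\n")
        return looks_like_parquet(fake_parquet), looks_like_parquet(plain_text)


if __name__ == "__main__":
    print(demo())
```

Both sample files end in .txt, yet only the first one passes the check, which is the situation described in the question. A file that passes is not guaranteed to be valid Parquet; the check only rules out plain text that was never converted.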
      <pubDate>Mon, 11 Jan 2021 13:52:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Apache-Nifi-Using-PutParquet-the-HDFS-file-format/m-p/309349#M223813</guid>
      <dc:creator>stevenmatison</dc:creator>
      <dc:date>2021-01-11T13:52:43Z</dc:date>
    </item>
  </channel>
</rss>

