<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: HDFS Directory Structure Best Practices in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Directory-Structure-Best-Practices/m-p/49540#M51773</link>
    <description>Eric Sammer (author of Hadoop Operations) has written a great answer on this topic here:&lt;BR /&gt;&lt;BR /&gt;&lt;A href="https://www.quora.com/What-is-the-best-directory-structure-to-store-different-types-of-logs-in-HDFS-over-the-time/answer/Eric-Sammer" target="_blank"&gt;https://www.quora.com/What-is-the-best-directory-structure-to-store-different-types-of-logs-in-HDFS-over-the-time/answer/Eric-Sammer&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Hadoop Operations is a great book and has quite a few good tricks.</description>
    <pubDate>Tue, 17 Jan 2017 21:01:12 GMT</pubDate>
    <dc:creator>surajacharya</dc:creator>
    <dc:date>2017-01-17T21:01:12Z</dc:date>
    <item>
      <title>HDFS Directory Structure Best Practices</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Directory-Structure-Best-Practices/m-p/49538#M51772</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;Can someone point me to a good resource on "best practices" for a Hadoop directory structure for storing raw data, intermediate files, output files, metadata, etc. in HDFS? Do you segregate different data types into different directory structures? Are the directories labeled by date (YYMMDD)? What would a typical HDFS directory structure look like when setting up to store data?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 10:55:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Directory-Structure-Best-Practices/m-p/49538#M51772</guid>
      <dc:creator>plandis</dc:creator>
      <dc:date>2022-09-16T10:55:11Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS Directory Structure Best Practices</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Directory-Structure-Best-Practices/m-p/49540#M51773</link>
      <description>Eric Sammer (author of Hadoop Operations) has written a great answer on this topic here:&lt;BR /&gt;&lt;BR /&gt;&lt;A href="https://www.quora.com/What-is-the-best-directory-structure-to-store-different-types-of-logs-in-HDFS-over-the-time/answer/Eric-Sammer" target="_blank"&gt;https://www.quora.com/What-is-the-best-directory-structure-to-store-different-types-of-logs-in-HDFS-over-the-time/answer/Eric-Sammer&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;Hadoop Operations is a great book and has quite a few good tricks.</description>
      <pubDate>Tue, 17 Jan 2017 21:01:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Directory-Structure-Best-Practices/m-p/49540#M51773</guid>
      <dc:creator>surajacharya</dc:creator>
      <dc:date>2017-01-17T21:01:12Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS Directory Structure Best Practices</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Directory-Structure-Best-Practices/m-p/85475#M51774</link>
      <description>&lt;P&gt;It depends on the data layers in your HDFS directory. For instance, if you have a &lt;EM&gt;raw&lt;/EM&gt; and a &lt;EM&gt;standard&lt;/EM&gt; layer, this would be one possible approach.&lt;/P&gt;&lt;P&gt;&lt;EM&gt;Raw&lt;/EM&gt; is the first landing place for the data and needs to be as close to the original data as possible. &lt;EM&gt;Standard&lt;/EM&gt; is the staging layer, where the data is converted into different data formats but no semantic changes have yet been made to it.&lt;/P&gt;&lt;P&gt;The structure for raw data and meta is:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;raw&lt;/STRONG&gt;/businessarea/sourcesystem/&lt;STRONG&gt;data&lt;/STRONG&gt;/date&amp;amp;time&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;raw&lt;/STRONG&gt;/businessarea/sourcesystem/&lt;STRONG&gt;meta&lt;/STRONG&gt;/date&amp;amp;time&lt;/P&gt;&lt;P&gt;The structure for standard data and meta is:&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;standard&lt;/STRONG&gt;/businessarea/sourcesystem/&lt;STRONG&gt;data&lt;/STRONG&gt;/date&amp;amp;time&lt;/P&gt;&lt;P&gt;&lt;STRONG&gt;standard&lt;/STRONG&gt;/businessarea/sourcesystem/&lt;STRONG&gt;meta&lt;/STRONG&gt;/date&amp;amp;time&lt;/P&gt;&lt;P&gt;These conventions can also help when defining Sentry/Ranger policies based on AD groups.&lt;/P&gt;</description>
      <pubDate>Fri, 25 Jan 2019 12:55:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/HDFS-Directory-Structure-Best-Practices/m-p/85475#M51774</guid>
      <dc:creator>Karun</dc:creator>
      <dc:date>2019-01-25T12:55:22Z</dc:date>
    </item>
  </channel>
</rss>

