<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: AWS S3 bucket as a primary storage for HDFS in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/AWS-S3-bucket-as-a-primary-storage-for-HDFS/m-p/46334#M43624</link>
    <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I don't think that's possible, given that most applications rely on HDFS semantics (strong consistency, POSIX-like behavior), and S3 simply isn't designed as a file system (eventual consistency, blob store). Plus, you lose data locality.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As far as I know, most cloud use cases still use HDFS as temporary, intermediate storage and S3 as the permanent, final storage.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;There have been several investigations into using HDFS as the metadata store and cloud storage as the data store, but that is a huge amount of work (see HDFS-9806) and is probably in the Hadoop 4/CDH 7 timeframe.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hope this helps.&lt;/P&gt;</description>
    <pubDate>Sun, 16 Oct 2016 17:13:50 GMT</pubDate>
    <dc:creator>weichiu</dc:creator>
    <dc:date>2016-10-16T17:13:50Z</dc:date>
    <item>
      <title>AWS S3 bucket as a primary storage for HDFS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/AWS-S3-bucket-as-a-primary-storage-for-HDFS/m-p/46325#M43621</link>
      <description>&lt;P&gt;Hi Guys,&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I am looking for information about implementing an S3 bucket as the primary storage for HDFS. Has anyone done something like that? What are the pros and cons of such a solution?&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Thanks,&lt;/P&gt;
&lt;P&gt;Andrzej&lt;/P&gt;</description>
      <pubDate>Sun, 16 Oct 2016 12:16:44 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/AWS-S3-bucket-as-a-primary-storage-for-HDFS/m-p/46325#M43621</guid>
      <dc:creator>andrzej_jedrzej</dc:creator>
      <dc:date>2016-10-16T12:16:44Z</dc:date>
    </item>
    <item>
      <title>Re: AWS S3 bucket as a primary storage for HDFS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/AWS-S3-bucket-as-a-primary-storage-for-HDFS/m-p/46331#M43622</link>
      <description>Do you mean using S3 instead of HDFS? That would be a good idea for some cloud-environment clusters, such as those Cloudera Director helps run.&lt;BR /&gt;&lt;BR /&gt;Keep an eye out for our upcoming 5.9 release too, which brings several further cloud environment enhancements (incl. better S3 support).</description>
      <pubDate>Sun, 16 Oct 2016 15:10:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/AWS-S3-bucket-as-a-primary-storage-for-HDFS/m-p/46331#M43622</guid>
      <dc:creator>Harsh J</dc:creator>
      <dc:date>2016-10-16T15:10:33Z</dc:date>
    </item>
    <item>
      <title>Re: AWS S3 bucket as a primary storage for HDFS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/AWS-S3-bucket-as-a-primary-storage-for-HDFS/m-p/46333#M43623</link>
      <description>&lt;P&gt;I would like to ingest all my data into S3 and make it the primary storage layer (not a backup). It would be a cloud-based environment, e.g. deployed with Cloudera Director. Is it possible to specify the storage type during deployment?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I would like to run YARN, Spark, and Oozie jobs.&lt;/P&gt;</description>
      <pubDate>Sun, 16 Oct 2016 16:23:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/AWS-S3-bucket-as-a-primary-storage-for-HDFS/m-p/46333#M43623</guid>
      <dc:creator>andrzej_jedrzej</dc:creator>
      <dc:date>2016-10-16T16:23:52Z</dc:date>
    </item>
    <item>
      <title>Re: AWS S3 bucket as a primary storage for HDFS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/AWS-S3-bucket-as-a-primary-storage-for-HDFS/m-p/46334#M43624</link>
      <description>&lt;P&gt;Hi,&lt;/P&gt;&lt;P&gt;I don't think that's possible, given that most applications rely on HDFS semantics (strong consistency, POSIX-like behavior), and S3 simply isn't designed as a file system (eventual consistency, blob store). Plus, you lose data locality.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As far as I know, most cloud use cases still use HDFS as temporary, intermediate storage and S3 as the permanent, final storage.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;There have been several investigations into using HDFS as the metadata store and cloud storage as the data store, but that is a huge amount of work (see HDFS-9806) and is probably in the Hadoop 4/CDH 7 timeframe.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Hope this helps.&lt;/P&gt;</description>
      <pubDate>Sun, 16 Oct 2016 17:13:50 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/AWS-S3-bucket-as-a-primary-storage-for-HDFS/m-p/46334#M43624</guid>
      <dc:creator>weichiu</dc:creator>
      <dc:date>2016-10-16T17:13:50Z</dc:date>
    </item>
  </channel>
</rss>

