<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Sandbox HDFS Replication Set to 3 - Why? in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sandbox-HDFS-Replication-Set-to-3-Why/m-p/138109#M23427</link>
    <description>&lt;A rel="user" href="https://community.cloudera.com/users/3584/rcicak.html" nodeid="3584"&gt;@rcicak&lt;/A&gt;&lt;P&gt;Yes, you are right: 3x replication does not make sense on a single-node sandbox. It is set to 3 only because that is the default. I do have some thoughts on it, though. &lt;/P&gt;&lt;P&gt;Another way to look at replication: if you are querying the same table and a node is busy, you can run the same query on another node that holds a replica. That does not exactly apply in this case. &lt;/P&gt;&lt;P&gt;I would leave it at 3 so that, in case someone adds more nodes to the VMs, the data gets replicated correctly.&lt;/P&gt;</description>
    <pubDate>Tue, 22 Mar 2016 09:05:32 GMT</pubDate>
    <dc:creator>sdutta</dc:creator>
    <dc:date>2016-03-22T09:05:32Z</dc:date>
    <item>
      <title>Sandbox HDFS Replication Set to 3 - Why?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sandbox-HDFS-Replication-Set-to-3-Why/m-p/138107#M23425</link>
      <description>&lt;P&gt;While running the latest Sandbox (HDP 2.4 on Hortonworks Sandbox), I noticed via Ambari that HDFS had 500+ under-replicated blocks.  Opening /etc/hadoop/conf/hdfs-site.xml, I found dfs.replication=3 (the default; see &lt;A href="http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml" target="_blank"&gt;http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml&lt;/A&gt;).&lt;/P&gt;&lt;P&gt;Does anyone know why the Sandbox uses an HDFS replication factor of 3, aside from the fact that it's the HDFS default?  I'd assume most Sandbox users are running a virtual machine representing one node.  If this is the case, dfs.replication should be 1 in the Sandbox to prevent under-replicated blocks.  Is my assumption incorrect?  &lt;/P&gt;</description>
      <pubDate>Tue, 22 Mar 2016 04:42:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sandbox-HDFS-Replication-Set-to-3-Why/m-p/138107#M23425</guid>
      <dc:creator>RyanCicak</dc:creator>
      <dc:date>2016-03-22T04:42:31Z</dc:date>
    </item>
    <item>
      <title>Re: Sandbox HDFS Replication Set to 3 - Why?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sandbox-HDFS-Replication-Set-to-3-Why/m-p/138108#M23426</link>
      <description>&lt;P&gt;I will escalate this. Thank you for bringing it up.&lt;/P&gt;</description>
      <pubDate>Tue, 22 Mar 2016 05:39:33 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sandbox-HDFS-Replication-Set-to-3-Why/m-p/138108#M23426</guid>
      <dc:creator>aervits</dc:creator>
      <dc:date>2016-03-22T05:39:33Z</dc:date>
    </item>
    <item>
      <title>Re: Sandbox HDFS Replication Set to 3 - Why?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sandbox-HDFS-Replication-Set-to-3-Why/m-p/138109#M23427</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/3584/rcicak.html" nodeid="3584"&gt;@rcicak&lt;/A&gt;&lt;P&gt;Yes, you are right: 3x replication does not make sense on a single-node sandbox. It is set to 3 only because that is the default. I do have some thoughts on it, though. &lt;/P&gt;&lt;P&gt;Another way to look at replication: if you are querying the same table and a node is busy, you can run the same query on another node that holds a replica. That does not exactly apply in this case. &lt;/P&gt;&lt;P&gt;I would leave it at 3 so that, in case someone adds more nodes to the VMs, the data gets replicated correctly.&lt;/P&gt;</description>
      <pubDate>Tue, 22 Mar 2016 09:05:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sandbox-HDFS-Replication-Set-to-3-Why/m-p/138109#M23427</guid>
      <dc:creator>sdutta</dc:creator>
      <dc:date>2016-03-22T09:05:32Z</dc:date>
    </item>
    <item>
      <title>Re: Sandbox HDFS Replication Set to 3 - Why?</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sandbox-HDFS-Replication-Set-to-3-Why/m-p/138110#M23428</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/3584/rcicak.html" nodeid="3584"&gt;@Ryan Cicak&lt;/A&gt; The sandbox ships with many of the defaults used during a normal installation.  You can change the 3x replication in the configs, but the sandbox is mainly intended for working through the tutorials. &lt;/P&gt;</description>
      <pubDate>Wed, 13 Apr 2016 19:07:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/Sandbox-HDFS-Replication-Set-to-3-Why/m-p/138110#M23428</guid>
      <dc:creator>iroberts</dc:creator>
      <dc:date>2016-04-13T19:07:19Z</dc:date>
    </item>
  </channel>
</rss>

