<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question how to identify the problem about under replica blocks in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/how-to-identify-the-problem-about-under-replica-blocks/m-p/309858#M223946</link>
    <description>&lt;P&gt;We installed a small HDP cluster with a single DataNode machine.&lt;/P&gt;&lt;P&gt;The HDP version is `2.6.5` and the Ambari version is `2.6.1`.&lt;/P&gt;&lt;P&gt;This is a new cluster that contains two NameNodes and only one DataNode (worker machine).&lt;/P&gt;&lt;P&gt;The interesting behavior we see is that the `under replica` count on the Ambari dashboard keeps increasing; right now it shows `15000` under-replicated blocks.&lt;/P&gt;&lt;P&gt;As far as we know, the most common root cause of this problem is network issues between the NameNode and the DataNode,&lt;/P&gt;&lt;P&gt;but this isn't the case in our Hadoop cluster.&lt;/P&gt;&lt;P&gt;We can also decrease the under-replicated count with the following procedure:&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;su - &amp;lt;$hdfs_user&amp;gt;&lt;BR /&gt;&lt;BR /&gt;bash-4.1$ hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' &amp;gt;&amp;gt; /tmp/under_replicated_files&lt;BR /&gt;&lt;BR /&gt;-bash-4.1$ for hdfsfile in `cat /tmp/under_replicated_files`; do echo "Fixing $hdfsfile :" ; hadoop fs -setrep 3 $hdfsfile; done&lt;/P&gt;&lt;P&gt;But we don't want to do this, because the under-replication problem should not happen in the first place.&lt;/P&gt;&lt;P&gt;Maybe we need to tune some HDFS parameters, but we are not sure about this.&lt;/P&gt;&lt;P&gt;Please let us know about any advice that can help us.&lt;/P&gt;</description>
    <pubDate>Sun, 17 Jan 2021 17:19:31 GMT</pubDate>
    <dc:creator>mike_bronson7</dc:creator>
    <dc:date>2021-01-17T17:19:31Z</dc:date>
    <item>
      <title>how to identify the problem about under replica blocks</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-identify-the-problem-about-under-replica-blocks/m-p/309858#M223946</link>
      <description>&lt;P&gt;We installed a small HDP cluster with a single DataNode machine.&lt;/P&gt;&lt;P&gt;The HDP version is `2.6.5` and the Ambari version is `2.6.1`.&lt;/P&gt;&lt;P&gt;This is a new cluster that contains two NameNodes and only one DataNode (worker machine).&lt;/P&gt;&lt;P&gt;The interesting behavior we see is that the `under replica` count on the Ambari dashboard keeps increasing; right now it shows `15000` under-replicated blocks.&lt;/P&gt;&lt;P&gt;As far as we know, the most common root cause of this problem is network issues between the NameNode and the DataNode,&lt;/P&gt;&lt;P&gt;but this isn't the case in our Hadoop cluster.&lt;/P&gt;&lt;P&gt;We can also decrease the under-replicated count with the following procedure:&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;su - &amp;lt;$hdfs_user&amp;gt;&lt;BR /&gt;&lt;BR /&gt;bash-4.1$ hdfs fsck / | grep 'Under replicated' | awk -F':' '{print $1}' &amp;gt;&amp;gt; /tmp/under_replicated_files&lt;BR /&gt;&lt;BR /&gt;-bash-4.1$ for hdfsfile in `cat /tmp/under_replicated_files`; do echo "Fixing $hdfsfile :" ; hadoop fs -setrep 3 $hdfsfile; done&lt;/P&gt;&lt;P&gt;But we don't want to do this, because the under-replication problem should not happen in the first place.&lt;/P&gt;&lt;P&gt;Maybe we need to tune some HDFS parameters, but we are not sure about this.&lt;/P&gt;&lt;P&gt;Please let us know about any advice that can help us.&lt;/P&gt;</description>
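The listing step in the procedure above could be sketched a little more defensively. This is only an illustration, not a confirmed fix: both function names are hypothetical, and it assumes (as the original one-liner does) that fsck prefixes each under-replicated block line with the file path followed by a colon. It prints each path once, since a file with several under-replicated blocks may appear more than once, and it only prints the setrep commands as a dry run instead of executing them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads `hdfs fsck /` output on stdin and prints each
# under-replicated file path once. Like the original one-liner, it assumes
# paths contain no ":" character.
under_replicated_files() {
  grep 'Under replicated' | awk -F':' '{print $1}' | sort -u
}

# Hypothetical helper: reads paths on stdin, one per line, and prints the
# setrep command for each path instead of running it (a dry run).
# The target replication factor is the first argument.
print_setrep_commands() {
  local factor="$1"
  while IFS= read -r path; do
    printf 'hdfs dfs -setrep %s "%s"\n' "$factor" "$path"
  done
}

# Example usage on a cluster (run as the HDFS superuser):
#   hdfs fsck / | under_replicated_files | print_setrep_commands 3
```

Reading line by line keeps paths that contain spaces intact, which the backtick-and-for loop would split.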
      <pubDate>Sun, 17 Jan 2021 17:19:31 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-identify-the-problem-about-under-replica-blocks/m-p/309858#M223946</guid>
      <dc:creator>mike_bronson7</dc:creator>
      <dc:date>2021-01-17T17:19:31Z</dc:date>
    </item>
    <item>
      <title>Re: how to identify the problem about under replica blocks</title>
      <link>https://community.cloudera.com/t5/Support-Questions/how-to-identify-the-problem-about-under-replica-blocks/m-p/310470#M224168</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/59349"&gt;@mike_bronson7&lt;/a&gt;&lt;/P&gt;&lt;P&gt;It seems to me like this is a symptom of having the &lt;STRONG&gt;default replication set to 3&lt;/STRONG&gt;. This is for redundancy and processing capability within HDFS. It is recommended to have a &lt;STRONG&gt;minimum of 3 DataNodes&lt;/STRONG&gt; in the cluster to accommodate 3 healthy replicas of a block (as the default replication is 3). HDFS will not write replicas of the same block to the same DataNode. In your scenario, 1 healthy replica is placed on the available DataNode and the blocks remain under-replicated.&lt;/P&gt;&lt;P&gt;You may run &lt;STRONG&gt;setrep&lt;/STRONG&gt; [1] to change the replication factor. If you provide a &lt;I&gt;path&lt;/I&gt; to a directory, then the command recursively changes the replication factor of all files under the directory tree rooted at &lt;I&gt;path&lt;/I&gt;.&lt;/P&gt;&lt;P&gt;hdfs dfs -setrep -w 1 /user/hadoop/dir1&lt;/P&gt;&lt;P&gt;[1]&amp;nbsp;&lt;A href="https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html#setrep" target="_blank" rel="noopener"&gt;https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/FileSystemShell.html#setrep&lt;/A&gt;&lt;/P&gt;</description>
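The mismatch described in this reply (a replication factor of 3 on a cluster with one DataNode) can be checked from the fsck report itself. The sketch below is an assumption-laden illustration: the function name is hypothetical, and it assumes the report summary contains "Default replication factor:" and "Number of data-nodes:" lines, as Hadoop 2.x fsck prints them.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads the text of an `hdfs fsck /` report on stdin
# and compares the default replication factor with the DataNode count.
# Exit status: 0 if the factor fits, 1 if it exceeds the DataNode count,
# 2 if the expected summary lines are missing.
replication_fits_cluster() {
  awk -F':' '
    /Default replication factor/ { gsub(/[^0-9]/, "", $2); rep = $2 }
    /Number of data-nodes/       { gsub(/[^0-9]/, "", $2); dn = $2 }
    END {
      if (rep == "" || dn == "") { print "summary lines not found"; exit 2 }
      if (rep + 0 > dn + 0) {
        print "replication " rep " exceeds " dn " DataNode(s)"
        exit 1
      }
      print "replication " rep " fits " dn " DataNode(s)"
    }'
}

# Example usage on a cluster (run as the HDFS superuser):
#   hdfs fsck / | replication_fits_cluster
#   hdfs getconf -confKey dfs.replication
```

Note that setrep only changes existing files; files written later still use the configured dfs.replication value, which the getconf call above prints.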
      <pubDate>Tue, 26 Jan 2021 23:16:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/how-to-identify-the-problem-about-under-replica-blocks/m-p/310470#M224168</guid>
      <dc:creator>MyNamesNotRick</dc:creator>
      <dc:date>2021-01-26T23:16:05Z</dc:date>
    </item>
  </channel>
</rss>

