<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question HDFS attempting to use invalid datanodes when StoragePolicies are configured in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/HDFS-attempting-to-use-invalid-datanodes-when/m-p/312884#M225307</link>
    <description>HDFS attempting to use invalid datanodes when StoragePolicies are configured in Support Questions</description>
    <pubDate>Thu, 11 Mar 2021 19:50:35 GMT</pubDate>
    <dc:creator>Babar</dc:creator>
    <dc:date>2021-03-11T19:50:35Z</dc:date>
    <item>
      <title>HDFS attempting to use invalid datanodes when StoragePolicies are configured</title>
      <link>https://community.cloudera.com/t5/Support-Questions/HDFS-attempting-to-use-invalid-datanodes-when/m-p/312884#M225307</link>
      <description>&lt;P&gt;Hi,&lt;BR /&gt;I have a test cluster running HDP 3.1.5 and Ambari 2.7.5 where, using Ambari config groups, I've configured 3 datanodes with the following values for "dfs.datanode.data.dir":&lt;BR /&gt;- dn-1: "[SSD]file:///dn_vg1/vol1_ssd"&lt;/P&gt;
&lt;P&gt;- dn-2: "[SSD]file:///dn_vg1/vol1_ssd,[SSD]file:///dn_vg2/vol2_ssd"&lt;/P&gt;
&lt;P&gt;- dn-3: "[DISK]file:///dn_vg1/vol1_disk,[SSD]file:///dn_vg3/vol3_ssd,[DISK]file:///dn_vg2/vol2_disk"&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;"dfs.replication" is set to 1 and the storage policies are all default (HOT).&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Most of my attempts to "hdfs dfs -put" a file into HDFS fail with:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;2021-03-11 14:58:33,315 WARN  blockmanagement.BlockPlacementPolicy (BlockPlacementPolicyDefault.java:chooseTarget(432)) - Failed to place enough replicas, still in need of 1 to reach 1 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;So it appears that HDFS attempts to use invalid datanodes (datanodes with no "DISK" storage type) before realising the "DISK" storage type is missing.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;Running "hdfs storagepolicies -setStoragePolicy -path / -policy All_SSD" makes it so that all "hdfs dfs -put"s go through without issue.&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;I've tried running "hadoop daemonlog -setlevel &amp;lt;namenode&amp;gt;:&amp;lt;port&amp;gt; org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy DEBUG" to get debug logs, but that returns:&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;LI-CODE lang="markup"&gt;Connecting to http://&amp;lt;namenode&amp;gt;:&amp;lt;port&amp;gt;/logLevel?log=org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy&amp;amp;level=DEBUG
Exception in thread "main" java.io.IOException: Server returned HTTP response code: 403 for URL: http://&amp;lt;namenode&amp;gt;:&amp;lt;port&amp;gt;/logLevel?log=org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy&amp;amp;level=DEBUG
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1900)
	at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1498)
	at org.apache.hadoop.log.LogLevel$CLI.process(LogLevel.java:297)
	at org.apache.hadoop.log.LogLevel$CLI.doSetLevel(LogLevel.java:244)
	at org.apache.hadoop.log.LogLevel$CLI.sendLogLevelRequest(LogLevel.java:130)
	at org.apache.hadoop.log.LogLevel$CLI.run(LogLevel.java:110)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
	at org.apache.hadoop.log.LogLevel.main(LogLevel.java:72)&lt;/LI-CODE&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;&amp;nbsp;&lt;/P&gt;
&lt;P&gt;How can I ensure HDFS selects the correct datanodes?&lt;/P&gt;
&lt;P&gt;Any help would be appreciated.&lt;/P&gt;</description>
      <pubDate>Thu, 11 Mar 2021 19:50:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/HDFS-attempting-to-use-invalid-datanodes-when/m-p/312884#M225307</guid>
      <dc:creator>Babar</dc:creator>
      <dc:date>2021-03-11T19:50:35Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS attempting to use invalid datanodes when StoragePolicies are configured</title>
      <link>https://community.cloudera.com/t5/Support-Questions/HDFS-attempting-to-use-invalid-datanodes-when/m-p/312965#M225339</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/86572"&gt;@Babar&lt;/a&gt;, it seems the DN disk configuration (&lt;STRONG&gt;dfs.datanode.data.dir&lt;/STRONG&gt;) is not appropriate. Could you please configure the disks as described here:&amp;nbsp;&lt;A href="https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_heterogeneous_storage_oview.html#admin_heterogeneous_storage_config" target="_blank"&gt;https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_heterogeneous_storage_oview.html#admin_heterogeneous_storage_config&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If your SSD disks are mounted as below:&lt;/P&gt;&lt;P&gt;/dn_vg1/vol1_ssd ----&amp;gt; mounted as ----&amp;gt; /data/1&lt;/P&gt;&lt;P&gt;/dn_vg2/vol2_ssd ----&amp;gt; mounted as ----&amp;gt; /data/2&lt;/P&gt;&lt;P&gt;/dn_vg3/vol3_ssd ----&amp;gt; mounted as ----&amp;gt; /data/3&lt;/P&gt;&lt;P&gt;and the SCSI/SATA disks are mounted as below:&lt;/P&gt;&lt;P&gt;/dn_vg1/vol1_disk ----&amp;gt; mounted as ----&amp;gt; /data/4&lt;/P&gt;&lt;P&gt;/dn_vg2/vol2_disk ----&amp;gt; mounted as ----&amp;gt; /data/5&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;then configure the DN data directories (&lt;STRONG&gt;dfs.datanode.data.dir&lt;/STRONG&gt;) as follows:&lt;/P&gt;&lt;P&gt;- dn-1: "[SSD]/data/1/dfs/dn"&lt;/P&gt;&lt;P&gt;- dn-2: "[SSD]/data/1/dfs/dn,[SSD]/data/2/dfs/dn"&lt;/P&gt;&lt;P&gt;- dn-3: "[DISK]/data/4/dfs/dn,[SSD]/data/3/dfs/dn,[DISK]/data/5/dfs/dn"&lt;/P&gt;&lt;P&gt;You need to create the /dfs/dn directories with ownership hdfs:hadoop and permission 700 on each mount point so that the volume can be used to store blocks.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please check the mount points and reconfigure the data directories.&lt;/P&gt;
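&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For example, the directory on dn-1 can be prepared like this (a minimal sketch assuming the example /data/1 mount point above; repeat for each mount point on every DataNode):&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;# create the block directory on the data mount, owned by hdfs:hadoop with mode 700
mkdir -p /data/1/dfs/dn
chown -R hdfs:hadoop /data/1/dfs/dn
chmod 700 /data/1/dfs/dn&lt;/LI-CODE&gt;</description>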
      <pubDate>Fri, 12 Mar 2021 18:47:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/HDFS-attempting-to-use-invalid-datanodes-when/m-p/312965#M225339</guid>
      <dc:creator>PabitraDas</dc:creator>
      <dc:date>2021-03-12T18:47:55Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS attempting to use invalid datanodes when StoragePolicies are configured</title>
      <link>https://community.cloudera.com/t5/Support-Questions/HDFS-attempting-to-use-invalid-datanodes-when/m-p/312985#M225351</link>
      <description>&lt;P&gt;I hit the same error after applying your suggested changes. However, I think I've "fixed" it.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;I managed to get "hadoop daemonlog" commands working by adding the group "hadoop" to "dfs.permissions.superusergroup" and "dfs.cluster.administrators". Turns out I had the same problem as described here:&amp;nbsp;&lt;A href="https://www.gresearch.co.uk/article/hdfs-troubleshooting-why-does-a-tier-get-blacklisted/" target="_blank" rel="noopener"&gt;https://www.gresearch.co.uk/article/hdfs-troubleshooting-why-does-a-tier-get-blacklisted/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;For now I've set "dfs.namenode.replication.considerLoad.factor" to 3, which solved the problem. A proper fix that makes the block placement policy account for storage policies should eventually come with Hadoop 3.4, whenever that releases:&amp;nbsp;&lt;A href="https://issues.apache.org/jira/browse/HDFS-14383" target="_blank" rel="noopener"&gt;https://issues.apache.org/jira/browse/HDFS-14383&lt;/A&gt;&lt;/P&gt;
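&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Roughly, the relevant "hdfs-site.xml" entries look like this (a sketch of what I changed through Ambari; the group name "hadoop" and the factor of 3 are just what my cluster uses):&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;&amp;lt;!-- grant the "hadoop" group HDFS superuser/admin rights so "hadoop daemonlog" is permitted --&amp;gt;
&amp;lt;property&amp;gt;
  &amp;lt;name&amp;gt;dfs.permissions.superusergroup&amp;lt;/name&amp;gt;
  &amp;lt;value&amp;gt;hadoop&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;
&amp;lt;property&amp;gt;
  &amp;lt;name&amp;gt;dfs.cluster.administrators&amp;lt;/name&amp;gt;
  &amp;lt;!-- the leading space marks "hadoop" as a group rather than a user --&amp;gt;
  &amp;lt;value&amp;gt; hadoop&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;
&amp;lt;!-- only treat a datanode as overloaded above 3x the average load (the default factor is 2.0) --&amp;gt;
&amp;lt;property&amp;gt;
  &amp;lt;name&amp;gt;dfs.namenode.replication.considerLoad.factor&amp;lt;/name&amp;gt;
  &amp;lt;value&amp;gt;3&amp;lt;/value&amp;gt;
&amp;lt;/property&amp;gt;&lt;/LI-CODE&gt;</description>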
      <pubDate>Sat, 13 Mar 2021 23:44:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/HDFS-attempting-to-use-invalid-datanodes-when/m-p/312985#M225351</guid>
      <dc:creator>Babar</dc:creator>
      <dc:date>2021-03-13T23:44:43Z</dc:date>
    </item>
    <item>
      <title>Re: HDFS attempting to use invalid datanodes when StoragePolicies are configured</title>
      <link>https://community.cloudera.com/t5/Support-Questions/HDFS-attempting-to-use-invalid-datanodes-when/m-p/313028#M225374</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/86572"&gt;@Babar&lt;/a&gt;, thank you for resolving the issue and marking the thread as solved.&lt;/P&gt;&lt;P&gt;Glad to know that you identified the problem and resolved it. Please note that HDFS-14383 (Compute datanode load based on StoragePolicy) has been included in the recent CDP 7.1.5 and 7.2.x releases.&lt;/P&gt;</description>
      <pubDate>Mon, 15 Mar 2021 11:40:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/HDFS-attempting-to-use-invalid-datanodes-when/m-p/313028#M225374</guid>
      <dc:creator>PabitraDas</dc:creator>
      <dc:date>2021-03-15T11:40:40Z</dc:date>
    </item>
  </channel>
</rss>

