<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question DATA_NODE_WEB_METRIC_COLLECTION has become bad in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DATA-NODE-WEB-METRIC-COLLECTION-has-become-bad/m-p/23059#M4251</link>
    <description>&lt;P&gt;Dear all,&lt;BR /&gt;&lt;BR /&gt;Version: Cloudera Express 5.0.2&lt;BR /&gt;3 master nodes&lt;BR /&gt;15 workers&lt;BR /&gt;&lt;BR /&gt;Problem:&lt;BR /&gt;"The health test result for DATA_NODE_WEB_METRIC_COLLECTION has become bad: The Cloudera Manager Agent is not able to communicate with this role's web server."&lt;/P&gt;&lt;P&gt;When this alert fires, records like the following appear in the datanode logs:&lt;BR /&gt;"INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 3121ms"&lt;/P&gt;&lt;P&gt;The alerts come from a specific group of datanodes, not from all of them.&lt;/P&gt;&lt;P&gt;What could be the problem here?&lt;/P&gt;&lt;P&gt;Thanks in advance&lt;/P&gt;&lt;P&gt;Sergey&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2022 09:16:25 GMT</pubDate>
    <dc:creator>szemlyanoy</dc:creator>
    <dc:date>2022-09-16T09:16:25Z</dc:date>
    <item>
      <title>DATA_NODE_WEB_METRIC_COLLECTION has become bad</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DATA-NODE-WEB-METRIC-COLLECTION-has-become-bad/m-p/23059#M4251</link>
      <description>&lt;P&gt;Dear all,&lt;BR /&gt;&lt;BR /&gt;Version: Cloudera Express 5.0.2&lt;BR /&gt;3 master nodes&lt;BR /&gt;15 workers&lt;BR /&gt;&lt;BR /&gt;Problem:&lt;BR /&gt;"The health test result for DATA_NODE_WEB_METRIC_COLLECTION has become bad: The Cloudera Manager Agent is not able to communicate with this role's web server."&lt;/P&gt;&lt;P&gt;When this alert fires, records like the following appear in the datanode logs:&lt;BR /&gt;"INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 3121ms"&lt;/P&gt;&lt;P&gt;The alerts come from a specific group of datanodes, not from all of them.&lt;/P&gt;&lt;P&gt;What could be the problem here?&lt;/P&gt;&lt;P&gt;Thanks in advance&lt;/P&gt;&lt;P&gt;Sergey&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 09:16:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DATA-NODE-WEB-METRIC-COLLECTION-has-become-bad/m-p/23059#M4251</guid>
      <dc:creator>szemlyanoy</dc:creator>
      <dc:date>2022-09-16T09:16:25Z</dc:date>
    </item>
    <item>
      <title>Re: DATA_NODE_WEB_METRIC_COLLECTION has become bad</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DATA-NODE-WEB-METRIC-COLLECTION-has-become-bad/m-p/23065#M4252</link>
      <description>It is possible that the datanode is handling more blocks, or dealing with more traffic, than its heap allows. If so, frequent full garbage collections may be occurring, which can cause such events.&lt;BR /&gt;&lt;BR /&gt;How many blocks do these datanodes have? What is the heap setting?</description>
      <pubDate>Wed, 24 Dec 2014 11:23:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DATA-NODE-WEB-METRIC-COLLECTION-has-become-bad/m-p/23065#M4252</guid>
      <dc:creator>GautamG</dc:creator>
      <dc:date>2014-12-24T11:23:53Z</dc:date>
    </item>
    <item>
      <title>Re: DATA_NODE_WEB_METRIC_COLLECTION has become bad</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DATA-NODE-WEB-METRIC-COLLECTION-has-become-bad/m-p/23067#M4253</link>
      <description>&lt;P&gt;Yes, one of my ideas is skewed data usage across datanodes.&lt;/P&gt;&lt;P&gt;I explored the data usage of the nodes and noticed that the workers which trigger alerts have higher block usage.&lt;/P&gt;&lt;P&gt;Below is a comparison of the sane nodes with the alerting ones.&lt;/P&gt;&lt;P&gt;Sane group:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TH&gt;Capacity&lt;/TH&gt;&lt;TH&gt;Used&lt;/TH&gt;&lt;TH&gt;Non DFS Used&lt;/TH&gt;&lt;TH&gt;Remaining&lt;/TH&gt;&lt;TH&gt;Blocks&lt;/TH&gt;&lt;TH&gt;Block pool used&lt;/TH&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;14.21 TB&lt;/TD&gt;&lt;TD&gt;1.64 TB&lt;/TD&gt;&lt;TD&gt;664.86 GB&lt;/TD&gt;&lt;TD&gt;11.92 TB&lt;/TD&gt;&lt;TD&gt;127220&lt;/TD&gt;&lt;TD&gt;1.64 TB (11.55%)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;14.21 TB&lt;/TD&gt;&lt;TD&gt;6.14 TB&lt;/TD&gt;&lt;TD&gt;666.38 GB&lt;/TD&gt;&lt;TD&gt;7.42 TB&lt;/TD&gt;&lt;TD&gt;639918&lt;/TD&gt;&lt;TD&gt;6.14 TB (43.23%)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;14.21 TB&lt;/TD&gt;&lt;TD&gt;4.99 TB&lt;/TD&gt;&lt;TD&gt;665.79 GB&lt;/TD&gt;&lt;TD&gt;8.57 TB&lt;/TD&gt;&lt;TD&gt;465164&lt;/TD&gt;&lt;TD&gt;4.99 TB (35.11%)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;14.21 TB&lt;/TD&gt;&lt;TD&gt;7.06 TB&lt;/TD&gt;&lt;TD&gt;666.4 GB&lt;/TD&gt;&lt;TD&gt;6.49 TB&lt;/TD&gt;&lt;TD&gt;795556&lt;/TD&gt;&lt;TD&gt;7.06 TB (49.71%)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;14.21 TB&lt;/TD&gt;&lt;TD&gt;4.74 TB&lt;/TD&gt;&lt;TD&gt;665.74 GB&lt;/TD&gt;&lt;TD&gt;8.82 TB&lt;/TD&gt;&lt;TD&gt;445655&lt;/TD&gt;&lt;TD&gt;4.74 TB (33.35%)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;14.21 TB&lt;/TD&gt;&lt;TD&gt;7.95 TB&lt;/TD&gt;&lt;TD&gt;666.13 GB&lt;/TD&gt;&lt;TD&gt;5.61 TB&lt;/TD&gt;&lt;TD&gt;907730&lt;/TD&gt;&lt;TD&gt;7.95 TB (55.96%)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;14.21 TB&lt;/TD&gt;&lt;TD&gt;6.13 TB&lt;/TD&gt;&lt;TD&gt;666.08 GB&lt;/TD&gt;&lt;TD&gt;7.43 TB&lt;/TD&gt;&lt;TD&gt;640631&lt;/TD&gt;&lt;TD&gt;6.13 TB (43.12%)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;Group with issues:&lt;/P&gt;&lt;TABLE&gt;&lt;TBODY&gt;&lt;TR&gt;&lt;TH&gt;Capacity&lt;/TH&gt;&lt;TH&gt;Used&lt;/TH&gt;&lt;TH&gt;Non DFS Used&lt;/TH&gt;&lt;TH&gt;Remaining&lt;/TH&gt;&lt;TH&gt;Blocks&lt;/TH&gt;&lt;TH&gt;Block pool used&lt;/TH&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;10.65 TB&lt;/TD&gt;&lt;TD&gt;8.96 TB&lt;/TD&gt;&lt;TD&gt;500.07 GB&lt;/TD&gt;&lt;TD&gt;1.2 TB&lt;/TD&gt;&lt;TD&gt;1175053&lt;/TD&gt;&lt;TD&gt;8.96 TB (84.13%)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;10.65 TB&lt;/TD&gt;&lt;TD&gt;8.57 TB&lt;/TD&gt;&lt;TD&gt;499.76 GB&lt;/TD&gt;&lt;TD&gt;1.59 TB&lt;/TD&gt;&lt;TD&gt;1136687&lt;/TD&gt;&lt;TD&gt;8.57 TB (80.51%)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;14.21 TB&lt;/TD&gt;&lt;TD&gt;8.94 TB&lt;/TD&gt;&lt;TD&gt;666.97 GB&lt;/TD&gt;&lt;TD&gt;4.62 TB&lt;/TD&gt;&lt;TD&gt;1209608&lt;/TD&gt;&lt;TD&gt;8.94 TB (62.89%)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;10.65 TB&lt;/TD&gt;&lt;TD&gt;8.65 TB&lt;/TD&gt;&lt;TD&gt;500.16 GB&lt;/TD&gt;&lt;TD&gt;1.5 TB&lt;/TD&gt;&lt;TD&gt;1133144&lt;/TD&gt;&lt;TD&gt;8.65 TB (81.28%)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;14.21 TB&lt;/TD&gt;&lt;TD&gt;8.98 TB&lt;/TD&gt;&lt;TD&gt;665.07 GB&lt;/TD&gt;&lt;TD&gt;4.58 TB&lt;/TD&gt;&lt;TD&gt;1225707&lt;/TD&gt;&lt;TD&gt;8.98 TB (63.19%)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;10.65 TB&lt;/TD&gt;&lt;TD&gt;8.62 TB&lt;/TD&gt;&lt;TD&gt;499.82 GB&lt;/TD&gt;&lt;TD&gt;1.54 TB&lt;/TD&gt;&lt;TD&gt;1168257&lt;/TD&gt;&lt;TD&gt;8.62 TB (80.98%)&lt;/TD&gt;&lt;/TR&gt;&lt;TR&gt;&lt;TD&gt;10.65 TB&lt;/TD&gt;&lt;TD&gt;8.94 TB&lt;/TD&gt;&lt;TD&gt;499.75 GB&lt;/TD&gt;&lt;TD&gt;1.22 TB&lt;/TD&gt;&lt;TD&gt;1172198&lt;/TD&gt;&lt;TD&gt;8.94 TB (83.98%)&lt;/TD&gt;&lt;/TR&gt;&lt;/TBODY&gt;&lt;/TABLE&gt;&lt;P&gt;Notably, the ill ones have more blocks in the pool.&lt;/P&gt;&lt;P&gt;Heap size for the DataNode Default Group: 1 GB&lt;/P&gt;</description>
      <pubDate>Wed, 24 Dec 2014 12:20:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DATA-NODE-WEB-METRIC-COLLECTION-has-become-bad/m-p/23067#M4253</guid>
      <dc:creator>szemlyanoy</dc:creator>
      <dc:date>2014-12-24T12:20:18Z</dc:date>
    </item>
    <item>
      <title>Re: DATA_NODE_WEB_METRIC_COLLECTION has become bad</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DATA-NODE-WEB-METRIC-COLLECTION-has-become-bad/m-p/23069#M4254</link>
      <description>It might be best to run the HDFS Balancer on a regular basis to remedy this. If you're running CDH 5.0.x or CDH 5.1.[0-3], consider upgrading to CDH 5.1.4 or CDH 5.2.0 for the fix to HDFS-6621.</description>
      <pubDate>Wed, 24 Dec 2014 12:42:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DATA-NODE-WEB-METRIC-COLLECTION-has-become-bad/m-p/23069#M4254</guid>
      <dc:creator>GautamG</dc:creator>
      <dc:date>2014-12-24T12:42:51Z</dc:date>
    </item>
    <item>
      <title>Re: DATA_NODE_WEB_METRIC_COLLECTION has become bad</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DATA-NODE-WEB-METRIC-COLLECTION-has-become-bad/m-p/23071#M4255</link>
      <description>&lt;P&gt;Hi Gautam,&lt;/P&gt;&lt;P&gt;Yes, we run the balancer on a regular basis, but it seems we are hitting this bug. We plan to upgrade the CM stack, but is the current issue related to balancer bugs?&lt;/P&gt;&lt;P&gt;Is there a relation between the skewed balancing and the web metric alerts?&lt;/P&gt;&lt;P&gt;Thanks&lt;/P&gt;&lt;P&gt;Sergey&lt;/P&gt;</description>
      <pubDate>Wed, 24 Dec 2014 13:15:08 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DATA-NODE-WEB-METRIC-COLLECTION-has-become-bad/m-p/23071#M4255</guid>
      <dc:creator>szemlyanoy</dc:creator>
      <dc:date>2014-12-24T13:15:08Z</dc:date>
    </item>
    <item>
      <title>Re: DATA_NODE_WEB_METRIC_COLLECTION has become bad</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/DATA-NODE-WEB-METRIC-COLLECTION-has-become-bad/m-p/23101#M4256</link>
      <description>&lt;P&gt;From what I understand so far, the issues appear only on datanodes that hold a large number of blocks, far more than the healthy ones. This can be remedied by running the HDFS Balancer.&lt;/P&gt;&lt;P&gt;In CDH 5.x, bug HDFS-6621 affects balancer performance. It is fixed in the GA releases 5.1.4 and 5.2.0 (and later versions like 5.3.0). It is not fixed in any 5.0.x version, so please consider upgrading to one of the above releases for the fix.&lt;/P&gt;</description>
      <pubDate>Fri, 26 Dec 2014 13:13:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/DATA-NODE-WEB-METRIC-COLLECTION-has-become-bad/m-p/23101#M4256</guid>
      <dc:creator>GautamG</dc:creator>
      <dc:date>2014-12-26T13:13:14Z</dc:date>
    </item>
  </channel>
</rss>

