<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Data Node Pause Duration in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/305502#M222472</link>
    <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/20288"&gt;@Shelton&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Apologies for the delay in replying. For my understanding, if possible, would you please explain how increasing NN Heap would fix DN Pause duration.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance&lt;/P&gt;&lt;P&gt;Amn&lt;/P&gt;</description>
    <pubDate>Fri, 06 Nov 2020 03:33:17 GMT</pubDate>
    <dc:creator>Amn_468</dc:creator>
    <dc:date>2020-11-06T03:33:17Z</dc:date>
    <item>
      <title>Data Node Pause Duration</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/304869#M222223</link>
      <description>&lt;P&gt;Hello,&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;On our data node, we are increasingly getting alerts related to Data Node Pause Duration. So far, this is happening on a single data node out of nine data nodes.&lt;/P&gt;&lt;P&gt;Following is the error captured from DN logs&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;2020-10-27 16:20:05,140 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1821ms GC pool 'ParNew' had collection(s): count=1 time=2075ms)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Current Java Heap Size of Data Node in Bytes is at 6GB&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;CM / CDH – 5.16.x&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Any help is appreciated.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards&lt;/P&gt;&lt;P&gt;Amn&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 14:39:15 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/304869#M222223</guid>
      <dc:creator>Amn_468</dc:creator>
      <dc:date>2022-09-16T14:39:15Z</dc:date>
    </item>
    <item>
      <title>Re: Data Node Pause Duration</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/304871#M222225</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/32334"&gt;@Amn_468&lt;/a&gt;&amp;nbsp;This is due to the Java Heap Size.&amp;nbsp;&lt;/P&gt;&lt;DIV class="cause"&gt;Let's say the default setting for the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;namenode_java_heapsize&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;is 1GB. Cloudera recommends having 1GB of heap space for every 1M blocks in a cluster.&lt;/DIV&gt;&lt;DIV class="cause"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class="instruction"&gt;&lt;P&gt;If the data in your cluster is growing rapidly, factor in the potential future number of blocks your cluster will require when determining the size setting, so you can avoid having to restart the namenode. &amp;nbsp;It is only possible to change the setting by restarting the namenode.&lt;/P&gt;&lt;H2&gt;Calculating the Required Heap Size&lt;/H2&gt;&lt;OL&gt;&lt;LI&gt;Determine the number of&amp;nbsp;blocks in the cluster. This information is available on the namenode web UI under the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;STRONG&gt;Summary&lt;/STRONG&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;section, with information like the following:&lt;PRE&gt;117,387 files and directories, 56,875 blocks = 174,262 total filesystem object(s).&lt;/PRE&gt;Alternatively, the information is available from the output of the&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;&lt;SPAN&gt;fsck&lt;/SPAN&gt;&lt;SPAN&gt;&amp;nbsp;&lt;/SPAN&gt;command:&lt;UL&gt;&lt;LI&gt;&lt;PRE&gt;Total size:    9958827546 B (Total open files size: 93 B)
 Total dirs:    20397
 Total files:    57993
 Total symlinks:        0 (Files currently being written: 1)
 Total blocks (validated):    56874 (avg. block size 175103 B) (Total open file blocks (not validated): 1)
 ...&lt;/PRE&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;/LI&gt;&lt;/OL&gt;Given the number of blocks, allocate 1GB of heap space for each 1M blocks, plus some additional memory for growth. For example, if there are 6,543,567 blocks, you need 6.5GB of heap to cover the current cluster size, but 8GB would be a sensible setting to allow for growth of the cluster.&lt;/DIV&gt;&lt;DIV class="instruction"&gt;&amp;nbsp;&lt;/DIV&gt;&lt;DIV class="instruction"&gt;After that you can adjust the Java Heap Size for NN. Hope this helps.&amp;nbsp;&lt;/DIV&gt;</description>
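As a rough illustration of the sizing rule in the reply above (1GB of heap per 1M blocks, plus room for growth), here is a minimal Python sketch. The function names and the 20 percent growth margin are assumptions made for this example, not Cloudera settings; the margin was picked only because it reproduces the 8GB figure from the worked example in the reply.

```python
import math

BLOCKS_PER_GB = 1_000_000  # rule of thumb from the reply: 1GB heap per 1M blocks


def baseline_heap_gb(block_count):
    """Heap needed for the current block count, in GB."""
    return block_count / BLOCKS_PER_GB


def suggested_heap_gb(block_count, growth_margin=0.20):
    """Baseline plus a growth margin, rounded up to a whole GB.

    growth_margin is a hypothetical value chosen for illustration.
    """
    return math.ceil(baseline_heap_gb(block_count) * (1 + growth_margin))


blocks = 6_543_567  # block count used in the reply's example
print(round(baseline_heap_gb(blocks), 1))  # 6.5
print(suggested_heap_gb(blocks))           # 8
```

The block count itself would come from the NameNode web UI summary or from fsck, as described in the reply.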
      <pubDate>Tue, 27 Oct 2020 06:46:24 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/304871#M222225</guid>
      <dc:creator>GangWar</dc:creator>
      <dc:date>2020-10-27T06:46:24Z</dc:date>
    </item>
    <item>
      <title>Re: Data Node Pause Duration</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/304872#M222226</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/32334"&gt;@Amn_468&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Try increasing the Java Heap Size for the NameNode and Secondary NameNode services; you could be using the default 1GB setting for heap size.&lt;/P&gt;&lt;P&gt;As a general rule of thumb, check your heap size configuration: for every 1 million blocks in your cluster you should have at least 1GB of heap.&lt;/P&gt;&lt;UL&gt;&lt;LI&gt;2 Million Blocks 2GB heap size&lt;/LI&gt;&lt;LI&gt;3 Million Blocks 3GB heap size&lt;BR /&gt;.....&lt;/LI&gt;&lt;LI&gt;n Million Blocks n GB heap size&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;After increasing the Java Heap Size, restart the HDFS services; that should resolve the issue.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Please revert&lt;/P&gt;</description>
      <pubDate>Tue, 27 Oct 2020 06:55:56 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/304872#M222226</guid>
      <dc:creator>Shelton</dc:creator>
      <dc:date>2020-10-27T06:55:56Z</dc:date>
    </item>
    <item>
      <title>Re: Data Node Pause Duration</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/304880#M222229</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/29629"&gt;@GangWar&lt;/a&gt;&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/20288"&gt;@Shelton&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Appreciate your assistance.&lt;/P&gt;&lt;P&gt;Following is the information available from the NN Web UI: (23,326,719 files and directories, 22,735,340 blocks = 46,062,059 total filesystem object(s).&lt;/P&gt;&lt;P&gt;Heap Memory used 5.47 GB of 10.6 GB Heap Memory. Max Heap Memory is 10.6 GB.&lt;/P&gt;&lt;P&gt;Non Heap Memory used 120.51 MB of 122.7 MB Commited Non Heap Memory. Max Non Heap Memory is &amp;lt;unbounded&amp;gt;.)&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Could you please confirm whether I need to adjust the NN heap memory or the DN heap memory? The issue is seen on a DataNode, and only on one of them; the other 8 seem to be running without any issues.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks&amp;nbsp;&lt;/P&gt;&lt;P&gt;Amn&lt;/P&gt;</description>
      <pubDate>Tue, 27 Oct 2020 07:41:19 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/304880#M222229</guid>
      <dc:creator>Amn_468</dc:creator>
      <dc:date>2020-10-27T07:41:19Z</dc:date>
    </item>
    <item>
      <title>Re: Data Node Pause Duration</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/304881#M222230</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/32334"&gt;@Amn_468&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The NameNode is solely responsible for the Cluster Metadata so please increase the NN heap size and restart the services.&lt;BR /&gt;Please revert&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;</description>
      <pubDate>Tue, 27 Oct 2020 07:58:55 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/304881#M222230</guid>
      <dc:creator>Shelton</dc:creator>
      <dc:date>2020-10-27T07:58:55Z</dc:date>
    </item>
    <item>
      <title>Re: Data Node Pause Duration</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/305502#M222472</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/20288"&gt;@Shelton&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Apologies for the delay in replying. For my understanding, if possible, would you please explain how increasing NN Heap would fix DN Pause duration.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks in advance&lt;/P&gt;&lt;P&gt;Amn&lt;/P&gt;</description>
      <pubDate>Fri, 06 Nov 2020 03:33:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/305502#M222472</guid>
      <dc:creator>Amn_468</dc:creator>
      <dc:date>2020-11-06T03:33:17Z</dc:date>
    </item>
    <item>
      <title>Re: Data Node Pause Duration</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/305583#M222506</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/32334"&gt;@Amn_468&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The NameNode is the brain of the cluster. It stores the HDFS metadata: the locations of the files, the ACLs, and the directory tree of all files in the file system. It tracks the files across the cluster but does not store the actual data; the data itself is stored on the DataNodes.&lt;/P&gt;&lt;P&gt;Your error&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;2020-10-27 16:20:05,140 INFO org.apache.hadoop.util.JvmPauseMonitor: Detected pause in JVM or host machine (eg GC): pause of approximately 1821ms GC pool 'ParNew' had collection(s): count=1 time=2075ms)&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT color="#3366FF"&gt;This indicates that the NameNode paused for longer than the expected time of 60000ms. This also explains why the DataNode did not get a response from the NameNode in the designated 60000ms.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The warning also indicates that the pause was due to GC, which calls for memory and GC tuning.&lt;/P&gt;&lt;P&gt;The NameNode knows the list of blocks and their locations; with this information it knows how to construct a file from its blocks. The fastest way to serve this information is to keep it in memory, which is why the NN usually runs on a high-end server configured with a lot of memory (RAM).&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;An ideal starter config in production for a DataNode and a NameNode would be&lt;/P&gt;&lt;P&gt;Name Node Configuration&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Processors: 2 Quad Core CPUs running @ 2 GHz
RAM: 128 GB
Disk: 6 x 1TB SATA
Network: 10 Gigabit Ethernet&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Data Node Configuration&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;Processors: 2 Quad Core CPUs running @ 2 GHz
RAM: 64 GB
Disk: 12-24 x 1TB SATA
Network: 10 Gigabit Ethernet&lt;/LI-CODE&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;A fundamental parameter in garbage collector tuning is the number of files and associated HDFS blocks stored in the Hadoop cluster, in your case &lt;FONT color="#FF0000"&gt;23,326,719 files and directories and 22,735,340 blocks&lt;/FONT&gt;. The NameNode maintains the complete directory structure in memory, so more files mean more objects to manage. Most of the time, Hadoop clusters are configured without knowledge of the final workload in terms of the number of files that will be stored. Keeping the strong connection between these two aspects in mind is crucial for anticipating future turbulence in the HDFS quality of service.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You should analyze the logs produced by the garbage collector (the gc.log files found in the NameNode logs directory) to see whether the available memory is filling up before the garbage collector is able to release it.&lt;/P&gt;&lt;P&gt;Hope that helps&lt;/P&gt;</description>
      <pubDate>Sun, 08 Nov 2020 11:31:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/305583#M222506</guid>
      <dc:creator>Shelton</dc:creator>
      <dc:date>2020-11-08T11:31:16Z</dc:date>
    </item>
    <item>
      <title>Re: Data Node Pause Duration</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/305661#M222537</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/32334"&gt;@Amn_468&lt;/a&gt;&amp;nbsp;The DN pause alert you see on 1 of 9 DataNodes is an indication of a growing block count on it.&lt;/P&gt;&lt;P&gt;Possibly the DN in question stores more blocks than the other nodes. You can compare the block counts of each DN in HDFS &amp;gt; HDFS &amp;gt; WebUI &amp;gt; Active NN Web UI &amp;gt; DataNodes &amp;gt; check the Blocks column under the section "In Operation".&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;The log snippet you shared indicates a pause of only about 2 seconds, which is not a cause for worry. However, with a proper JVM heap size allocated for the DN, you may avoid these frequent pause alerts.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;As a rule of thumb you need about 1GB of heap per 1 million blocks. Since you have 6GB allocated for the DN heap, please verify the block counts on the DNs and make sure they are not too high (&amp;gt; 6 million), which may explain why there are so many pause alerts.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;If the block count is higher than expected, you need to increase the heap size to accommodate the block objects in JVM heap memory.&amp;nbsp;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;On a side note, a growing block count is also an early warning of a small files problem in the cluster. You need to be vigilant about that. Verify the average block size; that will help you understand whether you have a small files problem in your cluster.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Pabitra Das&lt;/P&gt;</description>
      <pubDate>Mon, 09 Nov 2020 16:29:32 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/305661#M222537</guid>
      <dc:creator>PabitraDas</dc:creator>
      <dc:date>2020-11-09T16:29:32Z</dc:date>
    </item>
    <item>
      <title>Re: Data Node Pause Duration</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/305770#M222561</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/21311"&gt;@PabitraDas&lt;/a&gt;,&lt;/P&gt;&lt;P&gt;&lt;FONT size="4"&gt;Appreciate your assistance. Below is the block count on our DNs. As mentioned earlier, we have allocated 6 GB of JVM heap for the DNs and 10 GB of heap for the NN &amp;amp; SNN. &lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="4"&gt;Do you suggest increasing the DN heap, or the NN / SNN heap as suggested by Shelton?&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="4"&gt;Block Count:&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="4"&gt;Node 1 = 7421379&amp;nbsp;&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="4"&gt;Node 2 = 5569699&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="4"&gt;Node 3 = 6003009&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="4"&gt;Node 4 = 7444205&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="4"&gt;Node 5 = 8770674&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="4"&gt;Node 6 = 8849641&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="4"&gt;Node 7 = 8232779&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="4"&gt;Node 8 = 8354714&lt;/FONT&gt;&lt;BR /&gt;&lt;FONT size="4"&gt;Node 9 = 8860602&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="4"&gt;Also, I would greatly appreciate any pointers / suggestions (scripts etc.) for identifying a small files issue and possible remediation.&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;&lt;FONT size="4"&gt;Thanks&amp;nbsp;&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT size="4"&gt;Amn&lt;/FONT&gt;&lt;/P&gt;</description>
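To make the comparison concrete, the rule of thumb from the earlier reply (about 1GB of DN heap per 1 million blocks) can be applied to the block counts listed above. This is only a sketch: the dictionary and function names are made up for illustration, and the rule is a starting point rather than a hard limit.

```python
DN_HEAP_GB = 6              # current DataNode heap reported in the thread
BLOCKS_PER_GB = 1_000_000   # rule of thumb: about 1GB heap per 1M blocks

# Block counts per DataNode as posted above
block_counts = {
    "Node 1": 7421379,
    "Node 2": 5569699,
    "Node 3": 6003009,
    "Node 4": 7444205,
    "Node 5": 8770674,
    "Node 6": 8849641,
    "Node 7": 8232779,
    "Node 8": 8354714,
    "Node 9": 8860602,
}


def nodes_over_budget(counts, heap_gb):
    """Return the nodes whose block count exceeds heap_gb * BLOCKS_PER_GB."""
    budget = heap_gb * BLOCKS_PER_GB
    return {node: n for node, n in counts.items() if n > budget}


over = nodes_over_budget(block_counts, DN_HEAP_GB)
for node, n in over.items():
    print(f"{node}: {n / BLOCKS_PER_GB:.2f}M blocks, above the {DN_HEAP_GB}GB budget")
print(f"{len(over)} of {len(block_counts)} DataNodes exceed the rule of thumb")
```

Changing `DN_HEAP_GB` shows how many nodes would still sit above the rule for a different heap setting.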
      <pubDate>Wed, 11 Nov 2020 06:59:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/305770#M222561</guid>
      <dc:creator>Amn_468</dc:creator>
      <dc:date>2020-11-11T06:59:58Z</dc:date>
    </item>
    <item>
      <title>Re: Data Node Pause Duration</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/305780#M222565</link>
      <description>&lt;P&gt;Hello&amp;nbsp;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/32334"&gt;@Amn_468&lt;/a&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Since you reported the DN pause time, I was referring to the DN heap only.&amp;nbsp; The block counts on most of the DNs are &amp;gt; 6 million, so I would suggest increasing the DN heap to 8GB (from the current value of 6GB) and performing a rolling restart to bring the new heap size into effect.&amp;nbsp;&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;There is no straightforward way to tell whether you have hit the small files problem, but if your average block size is a few MB or less than 1 MB, it is an indication that you are accumulating small files in HDFS.&amp;nbsp; The simplest way to check for small files in the cluster is to run fsck.&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P class="p1"&gt;Fsck shows the average block size. If the value is too low (e.g. ~1MB), you might be hitting the small files problem, which would be worth looking at; otherwise, there is no need to review the number of blocks.&lt;/P&gt;&lt;P class="p2"&gt;&amp;nbsp;&lt;/P&gt;&lt;P class="p1"&gt;[..]&lt;/P&gt;&lt;P class="p1"&gt;$ hdfs fsck /&lt;/P&gt;&lt;P class="p1"&gt;..&lt;/P&gt;&lt;P class="p1"&gt;...&lt;/P&gt;&lt;P class="p1"&gt;Total blocks (validated): 2899 (avg. block size 11475601 B) &amp;lt;&amp;lt;&amp;lt;&amp;lt;&amp;lt;&lt;/P&gt;&lt;P class="p1"&gt;[..]&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;You may refer to the links below for help on dealing with small files.&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&lt;A href="https://blog.cloudera.com/small-files-big-foils-addressing-the-associated-metadata-and-application-challenges/" target="_blank"&gt;https://blog.cloudera.com/small-files-big-foils-addressing-the-associated-metadata-and-application-challenges/&lt;/A&gt;&lt;/P&gt;&lt;P&gt;-&amp;nbsp;&lt;A href="https://community.cloudera.com/t5/Community-Articles/Identify-where-most-of-the-small-file-are-located-in-a-large/ta-p/247253" target="_blank"&gt;https://community.cloudera.com/t5/Community-Articles/Identify-where-most-of-the-small-file-are-located-in-a-large/ta-p/247253&lt;/A&gt;&amp;nbsp;&lt;/P&gt;</description>
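The average block size check described above can be scripted. The sketch below parses the `Total blocks (validated)` summary line that `hdfs fsck` prints and flags a low average. The function names and the 1MB threshold are assumptions for illustration (the reply only gives ~1MB as an example of a low value), and the sample line is the one quoted in the reply.

```python
import re

# Matches the fsck summary line, e.g.
# "Total blocks (validated): 2899 (avg. block size 11475601 B)"
SUMMARY_RE = re.compile(
    r"Total blocks \(validated\):\s*(\d+)\s*\(avg\. block size (\d+) B\)"
)

SMALL_BLOCK_THRESHOLD = 1024 * 1024  # hypothetical cutoff: 1MB


def parse_fsck_summary(text):
    """Return (total_blocks, avg_block_size_bytes), or None if not found."""
    match = SUMMARY_RE.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def looks_like_small_files(avg_block_size, threshold=SMALL_BLOCK_THRESHOLD):
    """True when the average block size is at or below the threshold."""
    return threshold >= avg_block_size


sample = "Total blocks (validated): 2899 (avg. block size 11475601 B)"
total, avg = parse_fsck_summary(sample)
print(total, avg, looks_like_small_files(avg))  # 2899 11475601 False
```

In practice the text would come from the output of `hdfs fsck /`, for example captured to a file and read back in.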
      <pubDate>Wed, 11 Nov 2020 10:06:10 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Data-Node-Pause-Duration/m-p/305780#M222565</guid>
      <dc:creator>PabitraDas</dc:creator>
      <dc:date>2020-11-11T10:06:10Z</dc:date>
    </item>
  </channel>
</rss>

