<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: How to make data read faster in HDFS in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-make-data-read-faster-in-HDFS/m-p/117556#M30622</link>
    <description>&lt;P&gt;Thank you &lt;A rel="user" href="https://community.cloudera.com/users/566/jing.html" nodeid="566"&gt;@Jing Zhao&lt;/A&gt;, hedged reads are actually quite useful thanks to the multiplexing.&lt;/P&gt;</description>
    <pubDate>Fri, 03 Jun 2016 07:19:48 GMT</pubDate>
    <dc:creator>xzhou</dc:creator>
    <dc:date>2016-06-03T07:19:48Z</dc:date>
    <item>
      <title>How to make data read faster in HDFS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-make-data-read-faster-in-HDFS/m-p/117554#M30620</link>
      <description>&lt;P&gt;For some unknown reason, reading data through a DataNode can be very slow. Besides troubleshooting the root cause of the slowness, are there any alternative ways (e.g. different input channels) with the same semantics that could make reads faster? Thanks.&lt;/P&gt;</description>
      <pubDate>Fri, 03 Jun 2016 06:39:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-make-data-read-faster-in-HDFS/m-p/117554#M30620</guid>
      <dc:creator>xzhou</dc:creator>
      <dc:date>2016-06-03T06:39:27Z</dc:date>
    </item>
    <item>
      <title>Re: How to make data read faster in HDFS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-make-data-read-faster-in-HDFS/m-p/117555#M30621</link>
      <description>&lt;P&gt;First, try to identify the root cause of the slow reads: is it a network issue or a slow disk? Find the DataNode that serves the data and check its metrics to help debug the issue.&lt;/P&gt;&lt;P&gt;In the meantime, if the read is a positional read ("pread"), i.e., it goes through the API read(long, byte[], int, int), you can enable hedged reads in DFSClient by setting the configuration "dfs.client.hedged.read.threadpool.size" to a non-zero number. With hedged reads, if the reader decides that the first DataNode it is reading from is slow, it starts reading the same data from another DataNode (there are usually 3 replicas) before the first attempt finishes, and uses whichever response arrives first.&lt;/P&gt;</description>
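      <content:encoded><![CDATA[
<P>To make the hedged read idea above concrete, here is a minimal sketch in plain Java. It is a toy model only: it does not use any HDFS classes, and the names HedgedReadSketch, hedgedRead and thresholdMillis are made up for illustration. Each "replica" is just a Callable. The wait before starting a second read plays the role that, as far as I know, "dfs.client.hedged.read.threshold.millis" plays in the real client; the actual DFSClient logic is more involved.</P>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Toy model of a hedged read (hypothetical class, not part of HDFS). */
public class HedgedReadSketch {

    /**
     * Starts a read on the first replica. Whenever the reads in flight produce
     * no result within thresholdMillis, or one of them fails, a read is started
     * on the next replica. The first successful result wins and the remaining
     * reads are cancelled. The pool needs at least as many threads as reads
     * that may be in flight at once.
     */
    public static <T> T hedgedRead(List<Callable<T>> replicas, long thresholdMillis,
                                   ExecutorService pool) throws Exception {
        if (replicas.isEmpty()) {
            throw new IllegalArgumentException("no replicas to read from");
        }
        CompletionService<T> completed = new ExecutorCompletionService<>(pool);
        List<Future<T>> started = new ArrayList<>();
        Exception lastFailure = null;
        int next = 0;     // index of the next replica to try
        int pending = 0;  // reads submitted but not yet collected
        try {
            while (next < replicas.size() || pending > 0) {
                if (next < replicas.size()) {
                    started.add(completed.submit(replicas.get(next++)));
                    pending++;
                }
                // Wait only up to the threshold while another replica is left
                // to hedge with; otherwise block until some read finishes.
                Future<T> done = (next < replicas.size())
                        ? completed.poll(thresholdMillis, TimeUnit.MILLISECONDS)
                        : completed.take();
                if (done == null) {
                    continue; // threshold elapsed with no result: hedge
                }
                pending--;
                try {
                    return done.get(); // first successful read wins
                } catch (ExecutionException e) {
                    lastFailure = e;   // this read failed: try the next replica
                }
            }
            throw lastFailure; // every replica failed
        } finally {
            for (Future<T> f : started) {
                f.cancel(true); // interrupt reads that lost the race
            }
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<String> slowReplica = () -> { Thread.sleep(2000); return "block from slow replica"; };
            Callable<String> fastReplica = () -> { Thread.sleep(10); return "block from fast replica"; };
            System.out.println(hedgedRead(Arrays.asList(slowReplica, fastReplica), 50, pool));
        } finally {
            pool.shutdownNow();
        }
    }
}
```

<P>The trade-off is the same one the thread pool size setting controls: a hedged read costs an extra thread and extra I/O on a second DataNode, so it only pays off when slow replicas are the exception rather than the rule.</P>
      ]]></content:encoded>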
      <pubDate>Fri, 03 Jun 2016 07:05:12 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-make-data-read-faster-in-HDFS/m-p/117555#M30621</guid>
      <dc:creator>jing</dc:creator>
      <dc:date>2016-06-03T07:05:12Z</dc:date>
    </item>
    <item>
      <title>Re: How to make data read faster in HDFS</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-make-data-read-faster-in-HDFS/m-p/117556#M30622</link>
      <description>&lt;P&gt;Thank you &lt;A rel="user" href="https://community.cloudera.com/users/566/jing.html" nodeid="566"&gt;@Jing Zhao&lt;/A&gt;, hedged reads are actually quite useful thanks to the multiplexing.&lt;/P&gt;</description>
      <pubDate>Fri, 03 Jun 2016 07:19:48 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/How-to-make-data-read-faster-in-HDFS/m-p/117556#M30622</guid>
      <dc:creator>xzhou</dc:creator>
      <dc:date>2016-06-03T07:19:48Z</dc:date>
    </item>
  </channel>
</rss>

