<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: hdfs file actual block paths in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/hdfs-file-actual-block-paths/m-p/121413#M34283</link>
    <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/11718/leonli.html" nodeid="11718"&gt;@Leon L&lt;/A&gt;, the easiest way to do this from the command line, if you are an administrator, is to run the 'fsck' command with the &lt;EM&gt;-files -blocks -locations&lt;/EM&gt; options, e.g.:&lt;/P&gt;&lt;PRE&gt;$ hdfs fsck /myfile.txt -files -blocks -locations
Connecting to namenode via &lt;A href="http://localhost:50070" target="_blank"&gt;http://localhost:50070&lt;/A&gt;
FSCK started by someuser (auth:SIMPLE) from /127.0.0.1 for path /myfile.txt at Sun Jul 10 17:55:32 PDT 2016
/myfile.txt 875664 bytes, 1 block(s):  OK
0. BP-810817926-127.0.0.1-1468198364624:blk_1073741825_1001 len=875664 repl=1 [127.0.0.1:50010]
&lt;/PRE&gt;&lt;P&gt;This returns a list of blocks along with the DataNodes that hold the replicas of each block. It is a one-off solution if you need the block locations for a small number of files. As far as I know, there is no publicly available API to query the block locations for a file.&lt;/P&gt;&lt;P&gt;Could you please explain your use case?&lt;/P&gt;</description>
    <pubDate>Mon, 11 Jul 2016 08:16:06 GMT</pubDate>
    <dc:creator>ArpitAgarwal</dc:creator>
    <dc:date>2016-07-11T08:16:06Z</dc:date>
    <item>
      <title>hdfs file actual block paths</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/hdfs-file-actual-block-paths/m-p/121411#M34281</link>
      <description>&lt;P&gt;Is there a way to use the HDFS API to get a list of blocks and the data nodes that store a particular HDFS file?&lt;/P&gt;&lt;P&gt;If that's not possible, at a minimum, is there a way to determine which data nodes store a particular HDFS file?&lt;/P&gt;</description>
      <pubDate>Sat, 09 Jul 2016 23:52:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/hdfs-file-actual-block-paths/m-p/121411#M34281</guid>
      <dc:creator>leon_li</dc:creator>
      <dc:date>2016-07-09T23:52:40Z</dc:date>
    </item>
    <item>
      <title>Re: hdfs file actual block paths</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/hdfs-file-actual-block-paths/m-p/121412#M34282</link>
      <description>&lt;P&gt;You could try:&lt;/P&gt;&lt;PRE&gt;curl -i  "http://&amp;lt;HOST&amp;gt;:&amp;lt;PORT&amp;gt;/webhdfs/v1/&amp;lt;PATH&amp;gt;?op=GETFILESTATUS"
    &lt;/PRE&gt;&lt;P&gt;
The client receives a response with a &lt;A href="https://hadoop.apache.org/docs/r1.0.4/webhdfs.html#FileStatus"&gt;FileStatus JSON object&lt;/A&gt;:&lt;/P&gt;&lt;PRE&gt;HTTP/1.1 200 OK
Content-Type: application/json
Transfer-Encoding: chunked

{
  "FileStatus":
  {
    "accessTime"      : 0,
    "blockSize"       : 0,
    "group"           : "supergroup",
    "length"          : 0,             //in bytes, zero for directories
    "modificationTime": 1320173277227,
    "owner"           : "webuser",
    "pathSuffix"      : "",
    "permission"      : "777",
    "replication"     : 0,
    "type"            : "DIRECTORY"    //enum {FILE, DIRECTORY}
  }
}
    &lt;/PRE&gt;</description>
      <pubDate>Sun, 10 Jul 2016 06:39:13 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/hdfs-file-actual-block-paths/m-p/121412#M34282</guid>
      <dc:creator>mliem</dc:creator>
      <dc:date>2016-07-10T06:39:13Z</dc:date>
    </item>
    <item>
      <title>Re: hdfs file actual block paths</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/hdfs-file-actual-block-paths/m-p/121413#M34283</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/11718/leonli.html" nodeid="11718"&gt;@Leon L&lt;/A&gt;, the easiest way to do this from the command line, if you are an administrator, is to run the 'fsck' command with the &lt;EM&gt;-files -blocks -locations&lt;/EM&gt; options, e.g.:&lt;/P&gt;&lt;PRE&gt;$ hdfs fsck /myfile.txt -files -blocks -locations
Connecting to namenode via &lt;A href="http://localhost:50070" target="_blank"&gt;http://localhost:50070&lt;/A&gt;
FSCK started by someuser (auth:SIMPLE) from /127.0.0.1 for path /myfile.txt at Sun Jul 10 17:55:32 PDT 2016
/myfile.txt 875664 bytes, 1 block(s):  OK
0. BP-810817926-127.0.0.1-1468198364624:blk_1073741825_1001 len=875664 repl=1 [127.0.0.1:50010]
&lt;/PRE&gt;&lt;P&gt;This returns a list of blocks along with the DataNodes that hold the replicas of each block. It is a one-off solution if you need the block locations for a small number of files. As far as I know, there is no publicly available API to query the block locations for a file.&lt;/P&gt;&lt;P&gt;Could you please explain your use case?&lt;/P&gt;</description>
      <pubDate>Mon, 11 Jul 2016 08:16:06 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/hdfs-file-actual-block-paths/m-p/121413#M34283</guid>
      <dc:creator>ArpitAgarwal</dc:creator>
      <dc:date>2016-07-11T08:16:06Z</dc:date>
    </item>
    <item>
      <title>Re: hdfs file actual block paths</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/hdfs-file-actual-block-paths/m-p/121414#M34284</link>
      <description>&lt;P&gt;Is there any solution that uses the HDFS API?&lt;/P&gt;</description>
      <pubDate>Sat, 10 Feb 2018 16:17:21 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/hdfs-file-actual-block-paths/m-p/121414#M34284</guid>
      <dc:creator>YoxBox</dc:creator>
      <dc:date>2018-02-10T16:17:21Z</dc:date>
    </item>
  </channel>
</rss>

