Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

How to find the block locations of a file in HDFS using the Java API?

Visitor

Hello,

I want to get the block locations of a particular file, say abc.csv (500 MB), on a cluster with 1 NameNode and 3 DataNodes.

When I -put a file, it is divided into blocks of the default size (64 MB) and spread across the Hadoop cluster.

Using the web interface at "http://namenode:50070", we can find the block locations across the cluster.

The same information is also available from the command: hadoop fsck <file-path> -files -blocks -locations

But what I am trying to achieve is to get this information through the Java API or a web API.

Please let me know the solution, if any.

Any help will be appreciated.
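
For reference, the block arithmetic behind the question can be sketched in plain Java. This is only an illustration of how a 500 MB file would split at a 64 MB block size; the class and method names are hypothetical and it does not talk to HDFS or say which DataNode holds which block.

```java
// Illustrative sketch: models how a file length divides into fixed-size blocks.
// BlockLayout and split are hypothetical names, not part of the Hadoop API.
final class BlockLayout {

    /** Returns one {offset, length} pair per block, in file order. */
    static long[][] split(long fileLength, long blockSize) {
        if (fileLength < 0 || blockSize <= 0) {
            throw new IllegalArgumentException("fileLength must be >= 0 and blockSize > 0");
        }
        // Ceiling division: a partial trailing block still counts as a block.
        int count = (int) ((fileLength + blockSize - 1) / blockSize);
        long[][] blocks = new long[count][2];
        for (int i = 0; i < count; i++) {
            long offset = i * blockSize;
            blocks[i][0] = offset;
            // Every block is full-size except possibly the last one.
            blocks[i][1] = Math.min(blockSize, fileLength - offset);
        }
        return blocks;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        for (long[] block : split(500 * mb, 64 * mb)) {
            System.out.printf("offset=%d MB, length=%d MB%n", block[0] / mb, block[1] / mb);
        }
    }
}
```

With these numbers the file occupies 8 blocks: seven full 64 MB blocks and a final 52 MB block starting at offset 448 MB.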

1 ACCEPTED SOLUTION

Expert Contributor

@Yog Prabhhu, you can get the file block information from the WebHDFS REST API like this:

curl -i "http://<HOST>:<PORT>/webhdfs/v1/<FilePath>?op=GETFILEBLOCKLOCATIONS"

The corresponding Java API is FileSystem.getFileBlockLocations:

public BlockLocation[] getFileBlockLocations(FileStatus file,
    long start, long len)

You will get an array of block locations like the one below:

[BlockLocation(offset: 0, length: BLOCK_SIZE, hosts: {"host1:9866", "host2:9866", "host3:9866"}), ...]
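
If it helps, the WebHDFS request above can also be issued from Java using only the JDK's built-in HTTP client (Java 11+). This is a rough sketch rather than a definitive implementation: the class and method names are made up for illustration, and it assumes WebHDFS is enabled, the cluster is unsecured (no Kerberos or user.name parameter), and your Hadoop version supports op=GETFILEBLOCKLOCATIONS.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Sketch only: WebHdfsBlockLocations, buildUri and fetch are hypothetical names,
// not part of Hadoop. Assumes an unsecured cluster with WebHDFS enabled.
final class WebHdfsBlockLocations {

    /** Builds the GETFILEBLOCKLOCATIONS URI for an absolute HDFS path. */
    static URI buildUri(String host, int port, String hdfsPath) {
        if (!hdfsPath.startsWith("/")) {
            throw new IllegalArgumentException("HDFS path must be absolute: " + hdfsPath);
        }
        try {
            // The multi-argument constructor percent-encodes illegal characters in the path.
            return new URI("http", null, host, port, "/webhdfs/v1" + hdfsPath,
                    "op=GETFILEBLOCKLOCATIONS", null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot build WebHDFS URI", e);
        }
    }

    /** Sends the GET request and returns the raw response body (JSON on success). */
    static String fetch(URI uri) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("WebHDFS returned HTTP " + response.statusCode()
                    + ": " + response.body());
        }
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Arguments: <namenode-host> <namenode-http-port> <absolute-hdfs-path>
        URI uri = buildUri(args[0], Integer.parseInt(args[1]), args[2]);
        System.out.println(fetch(uri));
    }
}
```

The host and port are the NameNode's HTTP address, for example the one used for the web interface mentioned in the question. The response body is returned as a raw JSON string; parsing it is left to whichever JSON library you already use.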


3 REPLIES

Frequent Visitor

Hey, do you have any idea how I can actually locate and read a specific block using BlockLocation? I want to read the block byte by byte. (I understand it might be a remote read.)

Community Manager

@husseljo, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.



Regards,

Vidya Sargur,
Community Manager

