How to find the block locations of a file in HDFS using the Java API?

New Contributor

Hello,

I want to get the block locations of a particular file, say abc.csv (500 MB), on a cluster with 1 NM and 3 DNs.

When I -put a file, it is divided into blocks of the default size (64 MB), and the blocks are spread across the Hadoop cluster.

Using the web interface at "http://namenode:50070", we can see the block locations across the cluster.

The same information is available from the command: hadoop fsck <file-path> -files -blocks -locations

But what I am trying to achieve is to get this information through the Java API or a web API.

Please let me know the solution, if any.

Any help will be appreciated.

1 ACCEPTED SOLUTION

Re: How to find the block locations of a file in HDFS using the Java API?

Rising Star

@Yog Prabhhu, you can get the file block information from the WebHDFS REST API like this:

curl -i "http://<HOST>:<PORT>/webhdfs/v1/<FilePath>?op=GETFILEBLOCKLOCATIONS"
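If you would rather make the same REST call from Java than from curl, here is a minimal sketch using only the JDK (Java 11+ for java.net.http). The class name WebHdfsBlockLocations and the helpers buildUri and fetch are hypothetical names made up for this illustration. It assumes an unsecured cluster (no Kerberos, no user.name parameter) and a WebHDFS version that supports the GETFILEBLOCKLOCATIONS operation named above.

```java
import java.io.IOException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class WebHdfsBlockLocations {

    // Builds http://<host>:<port>/webhdfs/v1<filePath>?op=GETFILEBLOCKLOCATIONS.
    // filePath must be absolute (start with '/'); illegal characters such as
    // spaces are percent-encoded by the multi-argument URI constructor.
    static URI buildUri(String host, int port, String filePath) {
        try {
            return new URI("http", null, host, port,
                    "/webhdfs/v1" + filePath, "op=GETFILEBLOCKLOCATIONS", null);
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid WebHDFS path: " + filePath, e);
        }
    }

    // Sends the GET request and returns the raw response body.
    // Assumption: no authentication is required by the cluster.
    static String fetch(URI uri) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("WebHDFS returned HTTP " + response.statusCode());
        }
        return response.body();
    }

    // Usage: java WebHdfsBlockLocations <host> <port> <absolute-file-path>
    public static void main(String[] args) throws Exception {
        URI uri = buildUri(args[0], Integer.parseInt(args[1]), args[2]);
        System.out.println(fetch(uri));
    }
}
```

The response body is printed as-is; parsing it is left out so the sketch does not depend on any JSON library.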

The corresponding Java API is FileSystem.getFileBlockLocations:

public BlockLocation[] getFileBlockLocations(FileStatus file,
    long start, long len)

You will get an array of block locations like below:

[BlockLocation(offset: 0, length: BLOCK_SIZE, hosts: {"host1:9866", "host2:9866", "host3:9866"}), ...]
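To show how that call might be used end to end, here is a small sketch that prints the offset, length and hosts of every block of a file. The class name BlockLocationLister and the helper describeBlocks are hypothetical names for this illustration. It assumes the hadoop-client libraries are on the classpath and that the cluster configuration (core-site.xml, hdfs-site.xml) is visible to the JVM, so that a Path like /user/demo/abc.csv resolves to HDFS.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.BlockLocation;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class BlockLocationLister {

    // Returns one line per block: its offset in the file, its length,
    // and the hosts that hold a replica of it.
    static List<String> describeBlocks(FileSystem fs, Path file) throws IOException {
        FileStatus status = fs.getFileStatus(file);
        // Ask for the whole file: start at byte 0, cover status.getLen() bytes.
        BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());

        List<String> lines = new ArrayList<>();
        for (BlockLocation block : blocks) {
            lines.add(String.format("offset=%d length=%d hosts=%s",
                    block.getOffset(), block.getLength(),
                    Arrays.toString(block.getHosts())));
        }
        return lines;
    }

    // Usage: java BlockLocationLister <file-path>
    public static void main(String[] args) throws IOException {
        Configuration conf = new Configuration();
        Path file = new Path(args[0]);
        // Picks the file system from the path's scheme or from fs.defaultFS.
        FileSystem fs = file.getFileSystem(conf);
        for (String line : describeBlocks(fs, file)) {
            System.out.println(line);
        }
    }
}
```

To inspect only part of a file, pass a narrower start and len to getFileBlockLocations; only the blocks overlapping that byte range are returned.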