How to find the block locations of a file in HDFS using the Java API?
- Labels: Apache Hadoop
Created 02-10-2018 08:14 AM
Hello,
I want to get the block locations of a particular file, say abc.csv (500 MB), on a cluster with 1 NameNode and 3 DataNodes.
When I -put a file, it is divided into blocks of the default size (64 MB) and spread across the Hadoop cluster.
Using the web interface at "http://namenode:50070" we can find the block locations across the cluster.
We can also use the command: hadoop fsck <file-path> -files -blocks -locations
But what I am trying to achieve is to get this information through the Java API or a web API.
Please let me know the solution, if any.
Any help will be appreciated.
Created 02-13-2018 07:23 PM
@Yog Prabhhu, you can get the file block information from the WebHDFS REST API like this:
curl -i "http://<HOST>:<PORT>/webhdfs/v1/<FilePath>?op=GETFILEBLOCKLOCATIONS"
The corresponding Java API is FileSystem.getFileBlockLocations:
public BlockLocation[] getFileBlockLocations(FileStatus file, long start, long len)
You will get an array of block locations like the one below:
[BlockLocation(offset: 0, length: BLOCK_SIZE, hosts: {"host1:9866", "host2:9866", "host3:9866"}), ...]
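If you would rather call the REST endpoint from Java than from curl, something like the following might work. It is only a sketch that uses the JDK's built-in HTTP client (Java 11+) instead of the Hadoop client library. The class name WebHdfsBlockLocations and its helper methods are made up for illustration, and it assumes an unsecured cluster, an absolute HDFS path with no characters that need percent-encoding, and a Hadoop version whose WebHDFS supports GETFILEBLOCKLOCATIONS.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical helper class, not part of Hadoop.
public class WebHdfsBlockLocations {

    // Builds the WebHDFS URL shown in the curl command above.
    // Assumption: hdfsPath starts with "/" and needs no percent-encoding.
    static URI blockLocationsUri(String host, int port, String hdfsPath) {
        return URI.create("http://" + host + ":" + port
                + "/webhdfs/v1" + hdfsPath + "?op=GETFILEBLOCKLOCATIONS");
    }

    // Sends a GET request and returns the raw response body as a string.
    static String fetch(URI uri) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(uri).GET().build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.err.println("usage: WebHdfsBlockLocations <host> <port> <hdfs-path>");
            return;
        }
        URI uri = blockLocationsUri(args[0], Integer.parseInt(args[1]), args[2]);
        System.out.println(fetch(uri));
    }
}
```

You would run it with your own NameNode HTTP host and port and the file path, for example: java WebHdfsBlockLocations.java namenode 50070 /data/abc.csv (the path /data/abc.csv is just a placeholder).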
Created on 08-07-2022 06:02 PM - edited 08-07-2022 06:03 PM
Hey, do you have an idea how I can actually locate and read a specific block using BlockLocation? I want to read the block byte by byte. (I understand it might be a remote read.)
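One possible approach, offered only as a sketch: a BlockLocation carries the block's offset and length within the file, so reading "a block" amounts to reading that byte range. With WebHDFS, the OPEN operation accepts optional offset and length parameters. The class WebHdfsBlockRange and the method openRangeUri below are hypothetical names, /data/abc.csv is a placeholder path, and the 64 MB block size is taken from the original question rather than queried from the cluster.

```java
import java.net.URI;

// Hypothetical helper class, not part of Hadoop.
public class WebHdfsBlockRange {

    // Builds a WebHDFS OPEN URL restricted to one byte range of the file.
    // Assumption: hdfsPath starts with "/" and needs no percent-encoding.
    static URI openRangeUri(String host, int port, String hdfsPath,
                            long offset, long length) {
        return URI.create("http://" + host + ":" + port + "/webhdfs/v1" + hdfsPath
                + "?op=OPEN&offset=" + offset + "&length=" + length);
    }

    public static void main(String[] args) {
        // Assumed block size: 64 MB, as mentioned in the original question.
        long blockSize = 64L * 1024 * 1024;
        // The second block of the file starts at offset 1 * blockSize.
        System.out.println(
                openRangeUri("namenode", 50070, "/data/abc.csv", blockSize, blockSize));
    }
}
```

The NameNode answers an OPEN request with a redirect to a DataNode, so whichever client fetches this URL has to follow redirects (curl needs -L). With the Hadoop Java client, the comparable route would be FileSystem.open followed by seek on the returned FSDataInputStream to the block's offset, then reading up to the block's length.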
Created 08-08-2022 03:17 AM
@husseljo, as this is an older post, you would have a better chance of receiving a resolution by starting a new thread. This will also be an opportunity to provide details specific to your environment that could aid others in assisting you with a more accurate answer to your question. You can link this thread as a reference in your new post.
Regards,
Vidya Sargur, Community Manager
