How does HDFS generate the DataNode path?
Labels: Apache Hadoop
Created ‎08-20-2018 12:59 PM
Reference: https://stackoverflow.com/questions/51930336/how-datanode-path-is-created-in-hadoop
How does HDFS generate the DataNode path? For example, when I ask HDFS to write a file to the HDFS location hdfs://192.168.143.150:9000/filestore/TAO.mp4, it is written to the path "/data/hadoop-data/dn/current/BP-1308070615-172.22.131.23-1533215887051/current/finalized/subdir0/subdir0/blk_1073741869".
How is this path generated?
Created ‎08-20-2018 02:57 PM
@rabbit,
The path is already defined in Ambari (HDFS -> Configs -> DataNode Directories). This is the path you defined for HDFS to write the block information to the actual disk location on each DataNode.
In your case, this path must have been defined in Ambari as /data/hadoop-data/dn/. Under it, HDFS creates the remaining folders, starting from "current".
Please check your Ambari -> HDFS properties and confirm.
I hope this helps.
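To make the split between the configured part and the HDFS-managed part concrete, here is a minimal sketch using the values from the question. It assumes the Ambari "DataNode Directories" setting corresponds to the dfs.datanode.data.dir property and that it is set to /data/hadoop-data/dn; the variable names are only illustrative.

```python
from pathlib import PurePosixPath

# Assumed value of dfs.datanode.data.dir (Ambari: DataNode Directories).
data_dir = PurePosixPath("/data/hadoop-data/dn")

# On-disk block path reported in the question.
block_file = PurePosixPath(
    "/data/hadoop-data/dn/current/"
    "BP-1308070615-172.22.131.23-1533215887051/"
    "current/finalized/subdir0/subdir0/blk_1073741869"
)

# Everything below the configured directory is created by HDFS itself.
managed_parts = block_file.relative_to(data_dir).parts

for part in managed_parts:
    print(part)
```

The first printed component is "current" and the last is the block file blk_1073741869; everything in between is laid out by the DataNode, not by the administrator.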
Created ‎08-21-2018 09:28 AM
Thanks, @hdpadmin. You were right: I have configured "/data/hadoop-data/dn" in hadoop-site.xml. But why does HDFS create multiple subfolders within the current folder, like "BP-1308070615-172.22.131.23-1533215887051/current/finalized/subdir0/subdir0"? What do the subdirectory names mean?
Created ‎08-21-2018 09:41 AM
Hi @rabbit s
The BP stands for "block pool", a collection of blocks belonging to a single HDFS namespace.
The next part, 1308070615, is a randomly generated integer.
The IP address is the address of the NameNode that originally created the block pool.
The last part is the creation time of the namespace.
You can read more about this here:
https://hortonworks.com/blog/hdfs-metadata-directories-explained/
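As a rough illustration of the breakdown above, the block pool ID from the question can be split into its parts. This is only a sketch: parse_block_pool_id is a hypothetical helper, not part of Hadoop, and it assumes the last field is a Unix timestamp in milliseconds and that the address is IPv4 (so it contains no hyphens).

```python
from datetime import datetime, timedelta, timezone


def parse_block_pool_id(bpid):
    """Split a block pool ID of the form BP-<random>-<ip>-<millis>."""
    prefix, random_part, namenode_ip, millis = bpid.split("-")
    if prefix != "BP":
        raise ValueError("not a block pool ID: " + bpid)
    # Assumption: the last field is milliseconds since the Unix epoch.
    seconds, ms = divmod(int(millis), 1000)
    created = datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=ms)
    return {
        "random_id": int(random_part),
        "namenode_ip": namenode_ip,
        "created_utc": created,
    }


info = parse_block_pool_id("BP-1308070615-172.22.131.23-1533215887051")
print(info["random_id"])
print(info["namenode_ip"])
print(info["created_utc"].isoformat())
```

Under that assumption, the creation time works out to 2 August 2018 (UTC), a couple of weeks before the question was posted.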
Created ‎08-21-2018 10:22 AM
This is exactly what I was looking for! Thanks a lot, @Jonathan Sneep.
Created ‎08-21-2018 10:24 AM
Combining @hdpadmin's answer and @Jonathan Sneep's answer gives me the best answer to my question 🙂 Thank you both 🙂
Created ‎08-21-2018 10:26 AM
Why does it also create "current/finalized/subdir0/subdir0"?
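This part was not answered in the thread. As far as I understand it (treat this as an assumption, not a confirmed answer), "finalized" holds blocks that have been completely written, and the two subdir levels are derived from bits of the numeric block ID so that block files are spread across many directories. The sketch below follows my reading of the DataNode's block-ID-based layout; block_subdirs is a hypothetical helper, and the mask is an assumption that differs between Hadoop versions (0x1F in newer releases, 0xFF in some older ones).

```python
def block_subdirs(block_id, mask=0x1F):
    """Derive the two subdir levels from a block ID (assumed layout)."""
    d1 = (block_id >> 16) & mask  # first-level directory index
    d2 = (block_id >> 8) & mask   # second-level directory index
    return "subdir{}/subdir{}".format(d1, d2)


# Block file name taken from the question.
block_id = int("blk_1073741869".split("_")[1])
print(block_subdirs(block_id))
```

For the block in the question, both indices come out as 0 with either mask, which matches the subdir0/subdir0 seen on disk.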
