Created 08-20-2018 12:59 PM
Reference: https://stackoverflow.com/questions/51930336/how-datanode-path-is-created-in-hadoop
How does HDFS generate the DataNode path? For example, when I ask HDFS to write a file to the location hdfs://192.168.143.150:9000/filestore/TAO.mp4, it is written to the path "/data/hadoop-data/dn/current/BP-1308070615-172.22.131.23-1533215887051/current/finalized/subdir0/subdir0/blk_1073741869".
How does this path get generated?
Created 08-20-2018 02:57 PM
@rabbit,
The path is already defined in Ambari (HDFS -> Configs -> DataNode directories). This is the path you defined for HDFS to write the block information to the actual disk location on each DataNode.
In your case, this path must have been defined in Ambari as /data/hadoop-data/dn/. Under it, HDFS creates the remaining folders, starting from "current".
Please check your Ambari -> HDFS properties and confirm.
I hope this helps.
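If you would rather check the setting outside Ambari, here is a minimal Python sketch. It assumes the Ambari "DataNode directories" field corresponds to the dfs.datanode.data.dir property in hdfs-site.xml; the XML below is a made-up sample, not your actual file, and the helper name datanode_dirs is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical hdfs-site.xml content; a real file holds many more properties.
SAMPLE_HDFS_SITE = """<configuration>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>/data/hadoop-data/dn</value>
  </property>
</configuration>
"""


def datanode_dirs(xml_text):
    """Return the directories listed under dfs.datanode.data.dir."""
    root = ET.fromstring(xml_text)
    for prop in root.findall("property"):
        if prop.findtext("name") == "dfs.datanode.data.dir":
            value = prop.findtext("value") or ""
            # The value may be a comma-separated list of directories.
            return [d.strip() for d in value.split(",") if d.strip()]
    return []


print(datanode_dirs(SAMPLE_HDFS_SITE))
```

To run it against a real cluster config, you could read the file's text and pass it in; the file location varies by installation, so it is not hard-coded here.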
Created 08-21-2018 09:28 AM
Thanks @hdpadmin. You were right: I have configured "/data/hadoop-data/dn" in hadoop-site.xml. But why does HDFS create multiple subfolders within the current folder, like "BP-1308070615-172.22.131.23-1533215887051/current/finalized/subdir0/subdir0"? What do the subdirectory names mean?
Created 08-21-2018 09:41 AM
Hi @rabbit s
The BP stands for "block pool", a collection of blocks belonging to a single HDFS namespace.
The next part, 1308070615, is a randomly generated integer.
The IP address, 172.22.131.23, is the address of the NameNode that originally created the block pool.
The last part, 1533215887051, is the creation time of the namespace, in milliseconds since the Unix epoch.
You can read more about this here:
https://hortonworks.com/blog/hdfs-metadata-directories-explained/
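To make the breakdown above concrete, here is a small Python sketch that splits the block pool ID from the question into its parts. It assumes the ID always has the form BP-<random>-<IPv4 address>-<timestamp> and that the last field is milliseconds since the Unix epoch; the function name parse_block_pool_id is just for illustration and is not part of any Hadoop API.

```python
from datetime import datetime, timezone


def parse_block_pool_id(bpid):
    """Split 'BP-<random>-<ip>-<millis>' into its parts."""
    prefix, rand, ip, millis = bpid.split("-")
    if prefix != "BP":
        raise ValueError(f"not a block pool ID: {bpid!r}")
    # Assumption: the last field is milliseconds since the Unix epoch.
    created = datetime.fromtimestamp(int(millis) // 1000, tz=timezone.utc)
    return {"random": int(rand), "namenode_ip": ip, "created_utc": created}


info = parse_block_pool_id("BP-1308070615-172.22.131.23-1533215887051")
print(info["random"], info["namenode_ip"], info["created_utc"].isoformat())
```

For the ID in this thread, the timestamp decodes to a date in early August 2018, which fits a cluster that was formatted a few weeks before the question was posted.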
Created 08-21-2018 10:22 AM
This is exactly what I was looking for! Thanks a lot, @Jonathan Sneep.
Created 08-21-2018 10:24 AM
Combining @hdpadmin's answer and @Jonathan Sneep's answer gives me the best answer to my question 🙂 Thank you both 🙂
Created 08-21-2018 10:26 AM
Why does it then create "current/finalized/subdir0/subdir0" again inside the block pool directory?