What is the difference between volumes and folders?

Contributor

Hi experts,

Can someone please explain the difference between volumes and folders in Hadoop?

Thanks,

1 ACCEPTED SOLUTION

Master Collaborator

Hi @ryu 

Volume:

As described in the HDFS architecture, the NameNode stores metadata while the DataNodes store the actual data. Each DataNode is a machine that usually has multiple disks (in HDFS terminology, volumes). A file in HDFS consists of one or more blocks. Each block has one or more copies (called replicas), depending on the configured replication factor. Each replica is stored on one volume of a DataNode, and different replicas of the same block are stored on different DataNodes.
 
Directory (usually not called a folder):
As in other file systems, an HDFS directory is part of a hierarchical namespace: it contains files and other directories.
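
To make the relationship between blocks, replicas, DataNodes, and volumes concrete, here is a toy Python model. It is only a sketch: the node names and mount points are made up, taking the first N DataNodes is a simplification (real HDFS placement also considers racks and free space), and the round-robin volume choice is just one simple policy.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class DataNode:
    name: str            # hypothetical host name
    volumes: List[str]   # hypothetical local mount points, one per disk
    _next: int = 0       # index of the next volume to use

    def pick_volume(self) -> str:
        # Rotate through this node's volumes (round-robin).
        volume = self.volumes[self._next % len(self.volumes)]
        self._next += 1
        return volume


def place_block(datanodes: List[DataNode], replication: int) -> List[Tuple[str, str]]:
    """Return (datanode, volume) pairs, one per replica of a single block."""
    if replication > len(datanodes):
        raise ValueError("need at least as many DataNodes as replicas")
    # Simplification: use the first `replication` nodes, so every replica
    # of the block lands on a different DataNode.
    chosen = datanodes[:replication]
    return [(dn.name, dn.pick_volume()) for dn in chosen]


if __name__ == "__main__":
    nodes = [
        DataNode("dn1", ["/data/1", "/data/2"]),
        DataNode("dn2", ["/data/1"]),
        DataNode("dn3", ["/data/1", "/data/2"]),
    ]
    for datanode, volume in place_block(nodes, replication=3):
        print(f"replica on {datanode}, volume {volume}")
```

Note that the directory tree does not appear in this model at all: directories belong to the namespace kept by the NameNode, while volumes are where a DataNode physically keeps replicas.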
 
Regards,
Will

Contributor

Hi @willx ,
Is there a way to see whether a Hadoop path is a volume or a directory?
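
One possible way to check, sketched in Python under a few assumptions: the `hdfs` command is on the PATH, and the example paths are hypothetical. An HDFS path always names a file or a directory in the namespace, which `hdfs dfs -test -d` can confirm. Volumes are not HDFS paths at all; they are local directories on each DataNode, listed in the `dfs.datanode.data.dir` property, which `hdfs getconf -confKey` can print.

```python
import re
import subprocess
from typing import List


def is_hdfs_directory(path: str) -> bool:
    # `hdfs dfs -test -d` exits with status 0 when the path is a directory.
    result = subprocess.run(["hdfs", "dfs", "-test", "-d", path])
    return result.returncode == 0


def parse_data_dirs(value: str) -> List[str]:
    """Split a dfs.datanode.data.dir value into local volume directories."""
    dirs = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        # Drop an optional storage-type tag such as [DISK] or [SSD].
        entry = re.sub(r"^\[[A-Za-z_]+\]", "", entry)
        # Drop an optional file:// scheme, keeping the local path.
        entry = re.sub(r"^file://", "", entry)
        dirs.append(entry)
    return dirs


def configured_volumes() -> List[str]:
    # Read the volume list from the configuration visible on this host.
    out = subprocess.run(
        ["hdfs", "getconf", "-confKey", "dfs.datanode.data.dir"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_data_dirs(out)
```

For example, `is_hdfs_directory("/user/ryu")` would report on an HDFS namespace path, while `configured_volumes()` would return local paths on the DataNode's own disks. The value printed by `getconf` reflects the configuration files on the host where it is run, so it should be run on the DataNode in question.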