What is the difference between volumes and folders?

Contributor

Hi experts,

Can someone please explain the difference between volumes and folders in Hadoop?

 

Thanks,

1 ACCEPTED SOLUTION

Master Collaborator

Hi @ryu 

Volume:

As described in the HDFS architecture, the NameNode stores metadata while the DataNodes store the actual data. Each DataNode is a machine that usually has multiple disks (volumes, in HDFS terminology). A file in HDFS consists of one or more blocks. Each block has one or more copies (called replicas), depending on the configured replication factor. Each replica is stored on one volume of a DataNode, and different replicas of the same block are stored on different DataNodes.
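To make the relationship between blocks, replicas, DataNodes, and volumes concrete, here is a toy model in Python. It is only a sketch: the random selection stands in for HDFS's real placement policy (which also considers racks and free space), and the node names and `/data/N/dfs/dn` paths are made up for illustration.

```python
import random
from dataclasses import dataclass, field


@dataclass
class DataNode:
    name: str
    volumes: list                                  # local disk directories (hypothetical paths)
    replicas: dict = field(default_factory=dict)   # block id -> volume holding that replica


def place_block(block_id, datanodes, replication, rng):
    """Place `replication` replicas of one block on distinct DataNodes, one volume each."""
    if replication > len(datanodes):
        raise ValueError("not enough DataNodes for the requested replication factor")
    # sample() picks distinct nodes, so no two replicas of a block share a DataNode
    chosen = rng.sample(datanodes, replication)
    for node in chosen:
        # each replica lives on exactly one volume of its DataNode
        node.replicas[block_id] = rng.choice(node.volumes)
    return [(node.name, node.replicas[block_id]) for node in chosen]


rng = random.Random(0)
nodes = [DataNode(f"dn{i}", [f"/data/{d}/dfs/dn" for d in (1, 2, 3)]) for i in range(4)]
for block_id in ("blk_1", "blk_2"):
    print(block_id, place_block(block_id, nodes, replication=3, rng=rng))
```

Each printed pair is a (DataNode, volume) location for one replica. On a real cluster, `hdfs fsck <path> -files -blocks -locations` reports the actual block and replica locations of a file.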
 
Directory (the term "folder" is not normally used):
Like other file systems, HDFS organizes files in a hierarchical directory structure.
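One practical consequence: a directory is a path in the HDFS namespace, while volumes are local directories on each DataNode, listed in the `dfs.datanode.data.dir` property as a comma-separated list with an optional storage-type tag such as `[SSD]`. The sketch below parses such a value into its volumes. The helper name `parse_data_dirs` and the example paths are hypothetical, and the parsing is deliberately simplified.

```python
def parse_data_dirs(value):
    """Split a dfs.datanode.data.dir value into (storage_type, local_path) pairs."""
    volumes = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        storage_type = "DISK"  # untagged entries default to DISK
        if entry.startswith("["):
            # "[SSD]/data/1/dfs/dn" -> tag "SSD", path "/data/1/dfs/dn"
            tag, _, entry = entry[1:].partition("]")
            storage_type = tag.upper()
        volumes.append((storage_type, entry))
    return volumes


# Hypothetical property value with two volumes
example = "[SSD]/data/1/dfs/dn,/data/2/dfs/dn"
for storage_type, path in parse_data_dirs(example):
    print(storage_type, path)
```

On a cluster, `hdfs getconf -confKey dfs.datanode.data.dir` prints the configured value, and `hdfs dfs -ls <path>` lists a directory in the HDFS namespace.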
 
Regards,
Will


2 REPLIES

Contributor

Hi @willx,
Is there a way to tell whether a Hadoop path is a volume or a directory?