What is the difference between volumes and folders?

Contributor

Hi experts,

Can someone please explain the difference between volumes and folders in Hadoop?

 

Thanks,

1 ACCEPTED SOLUTION

Rising Star

Hi @ryu 

Volume:

As described in the HDFS architecture, the NameNode stores metadata while the DataNodes store the actual data. Each DataNode is a machine that usually has multiple disks (in HDFS terminology, volumes). A file in HDFS consists of one or more blocks. Each block has one or more copies (called replicas), depending on the configured replication factor. A replica is stored on one volume of a DataNode, and different replicas of the same block are stored on different DataNodes.
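To make the block/replica/volume relationship concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: the `DataNode` class, the `place_block` function, the node names, and the `/data/N` paths are all hypothetical, and the placement logic is a simple rotation. Real HDFS uses its own block placement and volume-choosing policies, and the volumes of a DataNode are the local directories listed in `dfs.datanode.data.dir`.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class DataNode:
    """Toy stand-in for a DataNode: one host with several disks (volumes)."""
    name: str
    volumes: list  # hypothetical local paths, one per disk
    _next: itertools.count = field(default_factory=itertools.count, repr=False)

    def pick_volume(self):
        # Round-robin over this node's volumes (a simplification).
        return self.volumes[next(self._next) % len(self.volumes)]


def place_block(block_id, datanodes, replication):
    """Return [(datanode_name, volume), ...], one entry per replica.

    Every replica goes to a different DataNode and lands on exactly
    one volume of that DataNode.
    """
    if replication > len(datanodes):
        raise ValueError("not enough DataNodes for the replication factor")
    # Simplification: rotate the starting node by block id.
    start = block_id % len(datanodes)
    chosen = [datanodes[(start + i) % len(datanodes)] for i in range(replication)]
    return [(dn.name, dn.pick_volume()) for dn in chosen]


nodes = [
    DataNode("dn1", ["/data/1", "/data/2"]),
    DataNode("dn2", ["/data/1", "/data/2"]),
    DataNode("dn3", ["/data/1", "/data/2"]),
]

# A file is split into blocks; here a file with two blocks and replication factor 3.
for block_id in (0, 1):
    print(block_id, place_block(block_id, nodes, replication=3))
```

The point of the model is that a volume is a physical storage location on one machine, and a given replica lives on exactly one of them.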
 
Directory (the usual term in HDFS, rather than "folder"):
As in other file systems, HDFS directories form a hierarchical structure in which files are organized.
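As a small illustration of that hierarchy, the sketch below lists the directories that contain a given path. The path `/user/ryu/data/part-00000` and the `ancestors` helper are made up for the example, and the code only manipulates path strings with Python's standard `pathlib`; it does not talk to a cluster.

```python
from pathlib import PurePosixPath


def ancestors(path):
    """List the directories that contain `path`, from the root down."""
    parents = list(PurePosixPath(path).parents)  # nearest parent first
    return [str(p) for p in reversed(parents)]


print(ancestors("/user/ryu/data/part-00000"))
# ['/', '/user', '/user/ryu', '/user/ryu/data']
```

Directories like these are part of the file system metadata kept by the NameNode, whereas volumes are disks on the DataNodes, so a path in HDFS names a file or a directory rather than a volume.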
 
Regards,
Will


2 REPLIES

Contributor

Hi @willx ,
Is there a way to check whether a Hadoop path is a volume or a directory?
