05-15-2015 07:55 AM
From what I have read so far in various sources, the HDFS roles and their corresponding HDFS mount points are configured as follows:
<< NN (NameNode) >>
Runs on a single Master server with a couple HDs (3 or 4x300GB HDs). [These are JBOD disks]
This is where NameNode Data Directories (dfs.name.dir, dfs.namenode.name.dir) are defined:
<< SNN (Secondary NameNode) >>
Runs on a single SecondaryMaster server with a couple HDs (3 or 4x300GB HDs). [These are JBOD disks]
This is where HDFS Checkpoint Directories (fs.checkpoint.dir, dfs.namenode.checkpoint.dir) are defined:
<< DN (DataNode) >>
Runs on many DataNode servers, where each host has many HDs (20 or so x300GB HDs). [These are JBOD disks]
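To make the layout above concrete, here is a rough Python sketch that builds the comma-separated directory lists these properties expect, one directory per JBOD disk. The mount points (/data/1, /data/2, ...) and the nn/snn/dn subdirectory names are only assumptions for illustration, as is the helper name jbod_dirs; dfs.datanode.data.dir is the DataNode counterpart of the properties named above.

```python
def jbod_dirs(disk_count, role_subdir, mount_prefix="/data"):
    """Return a comma-separated directory list, one directory per JBOD disk.

    mount_prefix and role_subdir are hypothetical naming choices, not
    something HDFS requires.
    """
    return ",".join(
        f"{mount_prefix}/{i}/dfs/{role_subdir}" for i in range(1, disk_count + 1)
    )


# Property names are real HDFS settings; the disk counts follow the post above.
hdfs_site = {
    "dfs.namenode.name.dir": jbod_dirs(3, "nn"),          # NameNode metadata
    "dfs.namenode.checkpoint.dir": jbod_dirs(3, "snn"),   # Secondary NameNode checkpoints
    "dfs.datanode.data.dir": jbod_dirs(20, "dn"),         # DataNode block storage
}

for name, value in hdfs_site.items():
    print(f"{name} = {value}")
```

Note that the lists behave differently per role: the NameNode and Secondary NameNode write a full copy of their metadata to every listed directory (for redundancy), while a DataNode spreads its blocks across the listed directories.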
05-28-2015 09:13 PM
In general, the NN and SNN do not require much storage. On DNs, however, it's not uncommon to see many hard disks used as data drives (around 12-24 per host). The number depends on your storage requirements.
You might be interested to read this blog post about hardware guidelines. Keep in mind the numbers are from 2013 but the same principles apply.
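As a back-of-the-envelope illustration of "depends on your storage requirements", the sketch below estimates how many data disks each DataNode would need. It assumes the HDFS default replication factor of 3 and a 25% headroom for non-HDFS and temporary data; the headroom figure, the function name, and the example dataset size are assumptions of mine, not numbers from the blog post.

```python
import math


def datanode_disks_needed(dataset_gb, nodes, disk_gb, replication=3, headroom=0.25):
    """Estimate data disks per DataNode for a given logical dataset size.

    replication=3 is the HDFS default; headroom is an assumed fraction of raw
    capacity kept free for temporary and non-HDFS data.
    """
    raw_gb = dataset_gb * replication / (1.0 - headroom)  # raw capacity required
    per_node_gb = raw_gb / nodes                           # share for each DataNode
    return math.ceil(per_node_gb / disk_gb)                # round up to whole disks


# Hypothetical example: 20 TB of data on 20 DataNodes with 300 GB disks.
print(datanode_disks_needed(20000, 20, 300))  # 14
```

With these assumed inputs the estimate lands at 14 disks per host, which is inside the 12-24 range mentioned above; change the dataset size or node count and the number moves accordingly.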