Hi experts, the CDH install screens have a DataNode configuration value:
DataNode Data Directory
The description says to use a comma-delimited list of directories on the local file system where the DataNode stores HDFS block data. Typical values are /data/N/dfs/dn for N = 1, 2, 3, ..., where each disk is a JBOD file mount. How do we specify this value if the DataNodes have different numbers of JBOD disks, say 20 disks in one DataNode and 10 in another? Since dfs.data.dir is a single global variable during install, how does it handle the 20 data directories on DataNodes that have only 10 JBOD hard disks? There is no hostname in this variable to indicate a different number of disks on different hosts. Also, if new DataNodes with a different number of disks are added in the future, how is this specified when adding them?
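To make the question concrete, here is a small Python sketch of the two values I think I would need. The helper name data_dirs is just something I made up for illustration, and I am assuming the mounts follow the /data/N/dfs/dn pattern from the description on every host.

```python
def data_dirs(num_disks: int) -> str:
    """Build the comma-delimited directory list for a node with num_disks JBOD mounts.

    Assumes every disk is mounted at /data/N and uses the dfs/dn subdirectory.
    """
    return ",".join(f"/data/{n}/dfs/dn" for n in range(1, num_disks + 1))


# Hypothetical example: one DataNode with 20 disks, another with 10.
big_node_value = data_dirs(20)
small_node_value = data_dirs(10)

print(small_node_value)
```

The 20-disk value lists /data/11/dfs/dn through /data/20/dfs/dn, which do not exist as mounts on the 10-disk host, and that mismatch is what I am asking about.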
I posted this question earlier but didn't get a reply, so I would appreciate any info you have. Thanks!