We currently have 3 DataNodes, each with 9 disks of 1 TB and one partition per disk (9 partitions per node). We are planning to add 2 new DataNodes, each with 3 disks of 2 TB. Do I need to configure the new DataNodes with 9 partitions like the old nodes, or with 6 partitions of 1 TB each?
Although a uniform disk configuration across the DataNodes in a cluster is expected, you can run DNs with two different disk configurations.
You can have one 2 TB partition on each disk (3 * 2 TB = 6 TB on each DN), even though the existing nodes have one 1 TB partition on each of their 9 disks (9 * 1 TB = 9 TB on each DN).
There will be no issue running DNs with such a configuration, but you may see the 6 TB DNs filling up faster than the 9 TB DNs, because the NN does not consider the available free space on DNs before writing blocks to them. The NameNode picks the DN randomly after evaluating the network distance of the DN from the client.
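To make the "filling faster" point concrete, here is a rough sketch of the arithmetic. It assumes an idealized even spread of data per node and ignores replication and rack awareness, so it is only an illustration, not a model of the real placement policy. The function name fill_fractions and the 15 TB figure are made up for the example.

```python
def fill_fractions(capacities_tb, total_written_tb):
    """Return the used fraction of each node, assuming data is spread evenly per node."""
    per_node = total_written_tb / len(capacities_tb)
    return [min(per_node / cap, 1.0) for cap in capacities_tb]


# Three existing 9 TB DataNodes plus two new 6 TB DataNodes.
capacities = [9, 9, 9, 6, 6]

# Hypothetical amount of data written to the cluster (assumption for illustration).
fractions = fill_fractions(capacities, total_written_tb=15)

for cap, frac in zip(capacities, fractions):
    print(f"{cap} TB node: {frac:.0%} used")
```

Under these assumptions every node receives 3 TB, which is a third of a 9 TB node but half of a 6 TB node, so the smaller nodes reach capacity first.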
Hope this helps. Thank you
Sorry for the late reply. I tested this configuration with 3 partitions of 2 TB each on the new nodes, such as /hdp/hdfs01, /hdp/hdfs02, and /hdp/hdfs03. But when I added the nodes to the cluster, I saw that it also created directories under /hdp for the other partitions, e.g. /hdp/hdfs04 - /hdp/hdfs09, as on the old DataNodes, and /hdp is on the root partition.
I believe it created these directories from the HDFS configs, where the DN directories are /hdp/hdfs01 - /hdp/hdfs09.
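One way to spot this situation on a node is to check which of the configured data directories are actually mount points. The following is a minimal sketch, assuming each data directory is meant to be its own mounted partition; the helper name dirs_not_mounted is hypothetical, and the path list simply mirrors the /hdp/hdfs01 - /hdp/hdfs09 layout described above.

```python
import os


def dirs_not_mounted(data_dirs, is_mount=os.path.ismount):
    """Return the configured data dirs that are not mount points on this host."""
    return [d for d in data_dirs if not is_mount(d)]


# Directory list taken from the cluster-wide config described in the thread.
configured = [f"/hdp/hdfs{i:02d}" for i in range(1, 10)]

for path in dirs_not_mounted(configured):
    # A plain directory here means blocks would land on the parent filesystem.
    print(f"{path} is not a mount point")
```

On a new node with only 3 mounted partitions, this would be expected to list /hdp/hdfs04 through /hdp/hdfs09, which are the directories that ended up on the root partition.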
To resolve your problem, you can create a new config group under the HDFS service, add the new DataNodes with 3 partitions to this new config group, and keep only those 3 partitions (/hdp/hdfs01, /hdp/hdfs02, /hdp/hdfs03) under dfs.datanode.data.dir.
By doing so, you will have two config groups with different settings, such as the DataNode partitions. You can make further changes as per your requirements.
Make sure you add those 3 partitions only under the new config group.
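As a rough illustration of what the two config groups would hold, the sketch below builds the comma-separated value of dfs.datanode.data.dir for each group. The group names "default" and "new-datanodes" and the helper data_dir_value are placeholders for this example, not names taken from the cluster.

```python
def data_dir_value(partition_count, prefix="/hdp/hdfs"):
    """Build a comma-separated dfs.datanode.data.dir value for N partitions."""
    return ",".join(f"{prefix}{i:02d}" for i in range(1, partition_count + 1))


# Hypothetical group names: old nodes keep 9 dirs, new nodes get only 3.
config_groups = {
    "default": data_dir_value(9),
    "new-datanodes": data_dir_value(3),
}

for group, value in config_groups.items():
    print(f"{group}: dfs.datanode.data.dir={value}")
```

The value for the new group contains only the three directories that exist as partitions on the new nodes, so the DataNode has no reason to create /hdp/hdfs04 - /hdp/hdfs09 there.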
Please accept this answer if it helped you resolve your issue.
Hello @pauljoshiva, you need to add the new nodes with a new config group: one set of DNs in the default config group (where the storage directories are laid out as /hdp/hdfs01 - /hdp/hdfs09) and another set of DNs in the new config group (with directories /hdp/hdfs01, /hdp/hdfs02, /hdp/hdfs03). That way you can have all DNs added to the cluster with 2 separate config groups.