Is it safe to have nodes with different number of disks?
Labels: Apache Ambari, Apache Hadoop, HDFS
Created ‎08-31-2021 08:27 AM
Hi experts,
In my current cluster, some DataNodes have only 2 disks and others have 3. Is it OK for nodes to have different numbers of disks while the DataNode configs specify 3 disks?
Also, is it OK if some disks are 2 TB and others are 3 TB?
Any advice is greatly appreciated.
Thanks,
Created on ‎08-31-2021 01:15 PM - edited ‎08-31-2021 01:16 PM
Hi,
Hadoop 3 provides an intra-node disk balancer in addition to the balancer that works across DataNodes, and together they can help you distribute and balance data across the disks and nodes of your cluster. The recommended setup is still for all DataNodes to have the same number and size of disks. It is possible to run DataNodes with different disk configurations, but you will need to rebalance them fairly often, which consumes compute and network resources.
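To make the rebalancing cost concrete, here is a minimal Python sketch (not HDFS code) of the check the cross-node balancer is built around: a node counts as out of balance when its utilization differs from the cluster-wide utilization by more than a threshold, which is 10 percentage points by default for `hdfs balancer`. The function name, node names, and figures are hypothetical, and the example assumes both nodes have received roughly the same amount of data.

```python
def nodes_outside_threshold(nodes, threshold_pct=10.0):
    """Return (cluster utilization %, {node: deviation in points}).

    nodes maps a node name to (used_tb, capacity_tb). Only nodes whose
    utilization deviates from the cluster figure by more than
    threshold_pct are included in the returned dict.
    """
    total_used = sum(used for used, _ in nodes.values())
    total_cap = sum(cap for _, cap in nodes.values())
    cluster_pct = 100.0 * total_used / total_cap

    outliers = {}
    for name, (used, cap) in nodes.items():
        node_pct = 100.0 * used / cap
        deviation = node_pct - cluster_pct
        if abs(deviation) > threshold_pct:
            outliers[name] = round(deviation, 1)
    return cluster_pct, outliers


# Hypothetical cluster: dn1 has 2 x 2 TB disks, dn2 has 3 x 3 TB disks,
# and each node currently holds 3 TB of block data.
cluster = {"dn1": (3.0, 4.0), "dn2": (3.0, 9.0)}
cluster_pct, outliers = nodes_outside_threshold(cluster)
print(round(cluster_pct, 1), outliers)
```

With equal data on unequal capacity, both nodes land outside the threshold (the small node is over-utilized, the large one under-utilized), so the balancer would have work to do each time the nodes fill up this way.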
Another thing to consider when disks differ in size is the DataNode volume choosing policy, which defaults to round robin. You should consider switching it to the available-space policy instead.
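The policy is controlled by the `dfs.datanode.fsdataset.volume.choosing.policy` property in hdfs-site.xml, and the two built-in choices are `RoundRobinVolumeChoosingPolicy` and `AvailableSpaceVolumeChoosingPolicy`. The Python sketch below is only a toy model of the difference, not the HDFS implementation: the function names and the 1-unit block size are hypothetical, and the "most free space" rule is a deliberate simplification (the real available-space policy is probabilistic and only kicks in once the free-space gap between volumes exceeds a configured threshold).

```python
def place_round_robin(capacities, n_blocks, block_size=1):
    """Write blocks to volumes in strict rotation, skipping full volumes."""
    used = [0] * len(capacities)
    idx = 0
    for _ in range(n_blocks):
        # Try each volume at most once, starting from the current position.
        for _attempt in range(len(capacities)):
            if used[idx] + block_size <= capacities[idx]:
                break
            idx = (idx + 1) % len(capacities)
        else:
            raise RuntimeError("all volumes are full")
        used[idx] += block_size
        idx = (idx + 1) % len(capacities)
    return used


def place_most_free(capacities, n_blocks, block_size=1):
    """Simplified available-space model: always pick the emptiest volume."""
    used = [0] * len(capacities)
    for _ in range(n_blocks):
        free = [cap - u for cap, u in zip(capacities, used)]
        idx = max(range(len(free)), key=free.__getitem__)
        if free[idx] < block_size:
            raise RuntimeError("all volumes are full")
        used[idx] += block_size
    return used


# One 2 TB disk and one 3 TB disk, measured in hypothetical 1 GB blocks.
disks = [2000, 3000]
print(place_round_robin(disks, 2400))  # blocks written per disk
print(place_most_free(disks, 2400))
```

In this model round robin puts the same amount of data on both disks, so the 2 TB disk reaches 60% while the 3 TB disk sits at 40%. The free-space rule sends more data to the larger disk until both have the same amount of free space left, so the smaller disk is not the first to fill up.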
I also suggest reading this article from Cloudera:
https://blog.cloudera.com/how-to-use-the-new-hdfs-intra-datanode-disk-balancer-in-apache-hadoop/
Best Regards
