Heterogeneous disk volumes in a data node

New Contributor

Hello Everyone,

The cluster has multiple data nodes with heterogeneous disk types and volumes mounted on them. For example, one node has 1 HDD, /d1/data, with 1 TB allocated and 1 SSD, /d2/data, with 360 GB allocated.

dfs.datanode.du.reserved was left at the default (10 GB). With this configuration the SSD has filled up to 345 GB, while the HDD is still less than 50% used.

I changed dfs.datanode.du.reserved to 50 GB and ran the balancer, but it seems the balancer only balances between nodes, not between volumes within a node. Can I manually move data from /d2/data to /d1/data to free space on the SSD? If yes, please suggest how that can be done.
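
For reference, a rough sketch of the arithmetic behind these numbers. It assumes that dfs.datanode.du.reserved is a per-volume value in bytes that the DataNode keeps free for non-HDFS use, and that "GB" here means GiB. The helper name usable_for_hdfs is made up for illustration.

```python
GIB = 1024 ** 3

def usable_for_hdfs(capacity_bytes: int, reserved_bytes: int) -> int:
    """Space HDFS may fill on one volume: capacity minus the reserved amount, never negative."""
    return max(capacity_bytes - reserved_bytes, 0)

ssd_capacity = 360 * GIB  # the /d2/data SSD volume from the post

old_limit = usable_for_hdfs(ssd_capacity, 10 * GIB)  # reserve of 10 GB
new_limit = usable_for_hdfs(ssd_capacity, 50 * GIB)  # reserve of 50 GB

print(old_limit // GIB)  # 350
print(new_limit // GIB)  # 310
print(50 * GIB)          # 50 GB expressed in bytes: 53687091200
```

Under these assumptions, 345 GB on the SSD fits under the old 350 GB limit but is above the new 310 GB limit. Raising the reserve should stop new blocks from landing on that volume, but it does not relocate the blocks already stored there.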

Thanks in advance.

Cheers

1 ACCEPTED SOLUTION

Mentor

To manage different disk hardware types effectively, HDFS offers a Heterogeneous Storage Management (HSM) feature, documented at https://www.cloudera.com/documentation/enterprise/latest/topics/admin_heterogeneous_storage_oview.ht.... Is that what you are looking for?

The HDFS mover tool that accompanies this feature, together with the storage policies that dictate replica placement, will let you manage the differing hardware better.
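
As a rough sketch of that workflow: tag each data directory with its storage type in dfs.datanode.data.dir, assign a storage policy to a path, then run the mover so existing replicas are migrated to match the policy. The snippet below only builds the strings and prints them; it does not run anything. The HDFS path /data/hot and the choice of the ONE_SSD policy are placeholders, and the /d1/data and /d2/data mounts are taken from the question. Please check the flags against the documentation for your Hadoop version before using them.

```python
import shlex

def data_dir_value(volumes: dict) -> str:
    """Build a dfs.datanode.data.dir value, prefixing each directory with its storage type tag."""
    return ",".join(f"[{storage_type}]{path}" for path, storage_type in volumes.items())

def set_policy_cmd(path: str, policy: str) -> list:
    """Command that assigns a storage policy to an HDFS path."""
    return ["hdfs", "storagepolicies", "-setStoragePolicy", "-path", path, "-policy", policy]

def mover_cmd(path: str) -> list:
    """Command that migrates existing replicas under a path to satisfy its storage policy."""
    return ["hdfs", "mover", "-p", path]

# Mount points from the question; the storage type tags are the assumption here.
volumes = {"/d1/data": "DISK", "/d2/data": "SSD"}

print(data_dir_value(volumes))
print(shlex.join(set_policy_cmd("/data/hot", "ONE_SSD")))  # /data/hot is a placeholder path
print(shlex.join(mover_cmd("/data/hot")))
```

Paths without an explicit policy fall back to the default HOT policy, which keeps replicas on DISK storage, so only data you mark with an SSD policy should end up on /d2/data.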

If you would still prefer to move blocks manually rather than configure the HSM feature, please read the method and caveat presented under http://wiki.apache.org/hadoop/FAQ#On_an_individual_data_node.2C_how_do_you_balance_the_blocks_on_the...
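
As far as I recall, the manual method in that FAQ entry boils down to: stop the DataNode, mv block files together with their .meta files from one volume to the other, then restart the DataNode. The caveat for newer releases is that the subdir directory structure must stay exactly the same on the target volume. Below is a minimal dry-run sketch of that path mapping, moving from the SSD volume to the HDD volume. It only prints the planned mv commands; the block pool and block file names are hypothetical, and the FAQ itself remains the authority on the procedure.

```python
from pathlib import PurePosixPath

def planned_destination(block_file: str, src_volume: str, dst_volume: str) -> str:
    """Map a block (or .meta) file on src_volume to the same relative path on dst_volume."""
    # relative_to raises ValueError if block_file is not under src_volume.
    relative = PurePosixPath(block_file).relative_to(src_volume)
    return str(PurePosixPath(dst_volume) / relative)

# Hypothetical block file layout, for illustration only.
src = "/d2/data/current/BP-1/current/finalized/subdir0/subdir1/blk_1001"

# A block and its .meta file must be moved as a pair.
for name in (src, src + "_2001.meta"):
    print("mv", name, planned_destination(name, "/d2/data", "/d1/data"))
```

Only do the actual move while the DataNode is shut down, and verify the result on a single node before repeating it elsewhere.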
