New Disk in Data Node
Labels: HDFS
Created 04-29-2015 11:29 AM
Hello,
We have a 12-node cluster with 4 data drives in each node. We added 1 additional disk to each DataNode but are having a hard time getting data placed on these additional drives. For example:
node01 -
/hdfs/01 - 1.4 TiB/1.8 TiB
/hdfs/02 - 1.5 TiB/1.8 TiB
/hdfs/03 - 1.5 TiB/1.8 TiB
/hdfs/04 - 1.5 TiB/1.8 TiB
/hdfs/05 - 18 GiB/1.8 TiB
node03 -
/hdfs/01 - 1.5 TiB/1.8 TiB
/hdfs/02 - 1.5 TiB/1.8 TiB
/hdfs/03 - 1.5 TiB/1.8 TiB
/hdfs/04 - 1.5 TiB/1.8 TiB
/hdfs/05 - 181.9 GiB/1.8 TiB
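To make the skew concrete, the figures above work out to roughly 78-83% utilization on the original four volumes versus about 1% on the new one. A quick sketch using node01's numbers:

```python
# Per-volume utilization computed from the figures reported above
# (all values in GiB; 1 TiB = 1024 GiB).
TIB = 1024

node01 = {
    "/hdfs/01": (1.4 * TIB, 1.8 * TIB),
    "/hdfs/02": (1.5 * TIB, 1.8 * TIB),
    "/hdfs/03": (1.5 * TIB, 1.8 * TIB),
    "/hdfs/04": (1.5 * TIB, 1.8 * TIB),
    "/hdfs/05": (18, 1.8 * TIB),  # the newly added disk
}

for path, (used, capacity) in node01.items():
    print(f"{path}: {used / capacity:.1%} full")
# /hdfs/01: 77.8% full ... /hdfs/05: 1.0% full
```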
We've been running the balancer for a few weeks now. Would it be easier to just wipe one DataNode at a time and reintroduce it into the cluster?
Thanks,
Eric
Created 04-29-2015 12:14 PM
The HDFS Balancer only balances blocks between DataNodes. It does not do any balancing between the drives within an individual DataNode.
You can set the DataNode Volume Choosing Policy (dfs.datanode.fsdataset.volume.choosing.policy) to Available Space (org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy). This causes DataNodes to write new blocks preferentially to the volumes with the most available space. It does not affect blocks that have already been written.
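In hdfs-site.xml that would look something like the fragment below. The two tuning properties shown are the standard companions to this policy; the values given are their documented defaults:

```xml
<!-- hdfs-site.xml: prefer volumes with more free space for new blocks -->
<property>
  <name>dfs.datanode.fsdataset.volume.choosing.policy</name>
  <value>org.apache.hadoop.hdfs.server.datanode.fsdataset.AvailableSpaceVolumeChoosingPolicy</value>
</property>
<!-- Volumes whose free space differs by less than this many bytes
     (default 10 GB) are treated as balanced and chosen round-robin. -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-threshold</name>
  <value>10737418240</value>
</property>
<!-- Fraction of new block allocations directed to the emptier volumes
     (default 0.75). -->
<property>
  <name>dfs.datanode.available-space-volume-choosing-policy.balanced-space-preference-fraction</name>
  <value>0.75</value>
</property>
```

Restart the DataNodes after changing the policy for it to take effect.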
For your question about wiping one DataNode at a time: it would be better to decommission and then recommission the node rather than wipe it. With a replication factor of 3 you can safely do this on up to 2 nodes at a time.
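A minimal sketch of that decommission/recommission cycle, assuming dfs.hosts.exclude already points at an excludes file (the file path and hostname below are examples, not values from this cluster):

```shell
# Example path; use whatever dfs.hosts.exclude points at in hdfs-site.xml.
EXCLUDES=/etc/hadoop/conf/dfs.exclude

# 1) Decommission: list the node and tell the NameNode to re-read the file.
echo "node01.example.com" >> "$EXCLUDES"
hdfs dfsadmin -refreshNodes

# 2) Wait until the report shows the node as "Decommissioned"
#    (all of its blocks have been re-replicated elsewhere).
hdfs dfsadmin -report

# 3) Recommission: remove the node from the excludes file, refresh again,
#    and restart the DataNode so it rejoins with its volumes.
sed -i '/node01.example.com/d' "$EXCLUDES"
hdfs dfsadmin -refreshNodes
```

Decommissioning first ensures no blocks are lost when the node's drives are reformatted between steps 2 and 3.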
David Wilder, Community Manager