
EBS volume resize to smaller size, is there any faster way to copy the data?



We currently have 13 EBS volumes attached to each datanode, and we have around 10 such datanodes.

Each disk on a datanode has about 1.5 TB used. We want to copy that 1.5 TB to a new, smaller volume.

Copying 100 GB takes about an hour or more, so 1.5 TB will take at least 15 hours per disk. Is there any faster way to copy the data?

I am using the rsync command to copy to the new disk.
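
As a rough back-of-the-envelope check on the numbers above, the copy time can be estimated from the observed rate. This is only a sketch: `estimate_copy_hours` is a hypothetical helper, and it assumes 1.5 TB is about 1500 GB and that the 100 GB per hour rate stays constant.

```shell
# Hypothetical helper: estimate copy time in whole hours, rounded up.
# $1 = data size in GB, $2 = observed throughput in GB per hour.
estimate_copy_hours() {
    size_gb=$1
    rate_gb_per_hour=$2
    echo $(( (size_gb + rate_gb_per_hour - 1) / rate_gb_per_hour ))
}

# One 1.5 TB disk (assumed ~1500 GB) at the observed 100 GB per hour.
per_disk=$(estimate_copy_hours 1500 100)
echo "hours per disk: $per_disk"

# All 13 disks of one datanode, copied one after another.
echo "hours per datanode (sequential): $(( per_disk * 13 ))"
```

Copying several disks at the same time might shorten the wall-clock time per datanode, provided the instance and volume throughput limits are not already the bottleneck.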



Hi @Madhura Mhatre,

This may not be the perfect answer, but you can try it this way:

First try the process below on one datanode; if it works, repeat it on the others.

1. Create a new config group containing one datanode.

For example, suppose the datanode is currently configured with /data1 (1.5 TB).

2. Override the dfs.datanode.data.dir config parameter in the new config group.

Remove /data1 and add /data2 (1 TB) and /data3 (1 TB).

3. Save the changes and restart the required services.

4. The blocks are automatically re-replicated from other datanodes, since the old drive is missing from the configuration.

Note: The cluster may slow down because of the heavy data transfer from other datanodes.
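
The directory change in step 2 and a follow-up check can be sketched in shell. This is an illustration only: `join_data_dirs` is a hypothetical helper, /data2 and /data3 are the example mount points from the steps above, and the commented `hdfs` commands assume a working HDFS client on the node. The approach also assumes every block on the removed disk has a replica on another datanode.

```shell
# Hypothetical helper: join mount points into the comma-separated list
# format used by the dfs.datanode.data.dir property.
join_data_dirs() {
    (IFS=,; printf '%s\n' "$*")
}

new_dirs=$(join_data_dirs /data2 /data3)
echo "dfs.datanode.data.dir=$new_dirs"

# After restarting the DataNode, re-replication progress could be watched
# with the standard HDFS tools (left commented because they need a cluster):
#   hdfs fsck /              # summary includes under-replicated blocks
#   hdfs dfsadmin -report    # per-datanode capacity and usage
```

Waiting until the under-replicated block count returns to zero before moving on to the next datanode would limit the risk of losing the only remaining replica of a block.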


@Madhura Mhatre

The fastest way to copy data between two EBS volumes attached to the same instance is to use 'dd', which copies everything, including the filesystem structures. It works best if you can unmount both volumes, or at least remount the source as read-only.

dd if=/dev/device1 of=/dev/device2 

Since you are copying to a bigger volume, you might want to run 'resize2fs /dev/device2' afterwards to expand the filesystem (this applies to ext2/ext3/ext4 filesystems).
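
Because dd copies the device block for block, the target has to be at least as large as the source. A minimal sketch of a size check before cloning follows; `can_clone` is a hypothetical helper, the sizes are made-up examples, /dev/device1 and /dev/device2 are the placeholder names used above, and the commented commands assume GNU dd, e2fsprogs, and root access.

```shell
# Hypothetical helper: succeed only if the target can hold a raw copy.
# $1 = source size in bytes, $2 = target size in bytes.
can_clone() {
    [ "$2" -ge "$1" ]
}

# Example with made-up sizes: a 1 TiB source and a 2 TiB target.
src_bytes=1099511627776
dst_bytes=2199023255552
if can_clone "$src_bytes" "$dst_bytes"; then
    echo "target is large enough for dd"
else
    echo "target is too small for dd"
fi

# On a real instance the sizes could come from blockdev, and the copy could
# use a larger block size than the dd default of 512 bytes:
#   src_bytes=$(blockdev --getsize64 /dev/device1)
#   dst_bytes=$(blockdev --getsize64 /dev/device2)
#   dd if=/dev/device1 of=/dev/device2 bs=64M status=progress
#   e2fsck -f /dev/device2    # check the copied filesystem before resizing
#   resize2fs /dev/device2    # grow the filesystem to fill the target
```

If the check fails, a raw dd clone is not an option and a file-level copy onto a freshly created filesystem is the remaining route.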


Hey @Geoffrey Shelton Okot. I tried this command, but it does not work because my current disk is 2.5 TB and the disk I am copying to is 1.7 TB, and this command requires the target disk to be at least as large as the source. Is there any alternative?