How to improve distcp performance

I want to run a distcp job that copies a huge amount of data from a source cluster to a destination cluster. How can I increase the performance or speed of the distcp job?


All the options for distcp are listed in the DistCp guide of the Hadoop documentation.

To improve that, I used -strategy dynamic and increased the number of mappers (-m) as well as the bandwidth per mapper (-bandwidth, in MB/s). You can also adjust the size of your containers if you want to customize it.

So you finally have:

hadoop distcp -prb -bandwidth 50 -m 16 -update -delete -strategy dynamic hdfs://source/path/.snapshot/20181030-170124.063 swebhdfs://target/path
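
To make the tuning knobs easier to adjust, the same command can be sketched as a small shell script. This is only a sketch: the variable names and the 2048 MB map container size are my own assumptions for illustration, and passing mapreduce.map.memory.mb with -D is just one possible way to set the container size. The script only prints the command so you can review it before running it.

```shell
#!/bin/sh
# Sketch only: variable names are hypothetical, values mirror the command above.
MAPPERS=16          # -m: maximum number of simultaneous map tasks
BANDWIDTH_MB=50     # -bandwidth: limit per map task, in MB/s
MAP_MEMORY_MB=2048  # assumed container size for each map task (illustrative value)

# Upper bound on total throughput if every map task reaches its bandwidth limit.
TOTAL_MBPS=$((MAPPERS * BANDWIDTH_MB))
echo "aggregate bandwidth cap: ${TOTAL_MBPS} MB/s"

# Build the command as a string; it is printed here, not executed.
DISTCP_CMD="hadoop distcp -Dmapreduce.map.memory.mb=${MAP_MEMORY_MB} -prb -bandwidth ${BANDWIDTH_MB} -m ${MAPPERS} -update -delete -strategy dynamic hdfs://source/path/.snapshot/20181030-170124.063 swebhdfs://target/path"
echo "$DISTCP_CMD"
```

With 16 mappers at 50 MB/s each, the job can use at most 800 MB/s in total, so check that the network link and the target cluster can absorb that before raising either value.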