07-27-2016 01:00 PM
We are testing distcp with -strategy dynamic and are seeing substantial performance improvements with our workload. I've looked through the online documentation and can't find a clear answer on when it should or shouldn't be used, only that it is better for MOST workloads.
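For context, here is a rough sketch of the two invocations being compared. The NameNode addresses, paths, and mapper count are placeholders rather than our real values, and `distcp_cmd` is just a hypothetical helper that prints each command for review instead of running it. `uniformsize` is the name the DistCp documentation gives for the default strategy.

```shell
#!/bin/sh
# Placeholder values (assumptions for illustration only); substitute your own.
SRC="hdfs://nn1:8020/data/src"
DST="hdfs://nn2:8020/data/dst"
MAPS=20

# Hypothetical helper: prints the distcp command for a given strategy.
# $1 is the strategy name: "uniformsize" (the default) or "dynamic".
distcp_cmd() {
  echo "hadoop distcp -strategy $1 -m $MAPS $SRC $DST"
}

# Dry run: show both commands so they can be checked before running them.
distcp_cmd uniformsize
distcp_cmd dynamic
```

Running each printed command against the same source data, with the same -m value, is how we are comparing the two strategies.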
The questions I have:
Are there any cases where you would not want to use it?
Are there cases (small file size, etc.) where -strategy uniformsize (the default) would outperform it?
What are the potential downsides?
If it is so much better, why isn't it the Hadoop default?