When to use distcp -strategy dynamic and why isn't it the Hadoop default?


We are testing distcp with -strategy dynamic and have seen substantial performance improvements with our workload. I've looked through the online documentation but can't find a clear answer on when it should or shouldn't be used, only that it is better for most workloads.


The questions I have:

Are there any cases where you would not want to use it?

Are there cases (small file sizes, etc.) where -strategy uniformsize would outperform it?

What are the potential downsides?

If it is so much better, why isn't it the Hadoop default?