
When to use distcp -strategy dynamic and why isn't it the Hadoop default?

We are testing distcp with -strategy dynamic and have noticed substantial performance improvements with our workload. I've looked through the online documentation and can't find a clear answer as to when it should or shouldn't be used, only that it is better for most workloads.

 

The questions I have:

Are there any cases where you would not want to use it?

Are there cases (small file sizes, etc.) where the default -strategy uniformsize would outperform it?

What are the potential downsides?

If it is so much better, why isn't it the Hadoop default?
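
For reference, this is roughly how the two strategies can be compared side by side. It is only a sketch: the cluster URIs, the `-m 20` map count, and the `distcp_cmd` helper are placeholders I made up for illustration, not our real setup. By default it only prints the commands; set DRY_RUN=0 to actually run and time them.

```shell
#!/usr/bin/env bash
# Hypothetical source and target; replace with real paths.
SRC="${SRC:-hdfs://nn1:8020/data/src}"
DST="${DST:-hdfs://nn2:8020/data/dst}"

# Build the distcp command line for a given copy strategy.
# Each strategy writes to its own target directory so the runs do not collide.
distcp_cmd() {
    echo "hadoop distcp -strategy $1 -m 20 $SRC $DST-$1"
}

# DRY_RUN=1 (the default) prints the commands instead of running them.
for strategy in uniformsize dynamic; do
    cmd=$(distcp_cmd "$strategy")
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$cmd"
    else
        time $cmd
    fi
done
```

Running each strategy against the same source with the same map count keeps the comparison fair; timings will still vary with cluster load.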

 
