DistCp of compacted files takes a lot of time

Hi,

I'm running a compaction MR job that merges the small files my other MR jobs may create.

After compaction I end up with files larger than 100 GB. Copying one such file with DistCp between the 2 farms uses 1 mapper and may take more than 5 hours, and I have a lot of these compacted files.

That means copying one day's worth of files with DistCp may take days, so I always have a huge backlog between the active farm and the backup DR farm.

Any suggestions?

Thanks in advance.

Re: DistCp of compacted files takes a lot of time

Did you check the network bandwidth between your clusters?

Did you check the bandwidth "allocated" to DistCp? It is capped per map task (the -bandwidth option, in MB/s), and I think the default is very low. Try increasing that parameter first.
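
For example, here is a minimal sketch of a DistCp run with the per-map bandwidth raised, written as a small Python helper that only builds the command line and does not run it. The NameNode URIs, the paths, and the values 200 and 20 are hypothetical placeholders, and the helper name is just for illustration. -bandwidth (MB/s per map) and -m (maximum number of simultaneous maps) are standard DistCp options.

```python
def build_distcp_command(src, dst, bandwidth_mb_per_map, max_maps):
    """Return the argument list for a DistCp run with an explicit per-map bandwidth cap."""
    return [
        "hadoop", "distcp",
        "-bandwidth", str(bandwidth_mb_per_map),  # MB/s allowed per map task
        "-m", str(max_maps),                      # maximum number of simultaneous maps
        src, dst,
    ]

# Hypothetical cluster URIs and values, for illustration only.
cmd = build_distcp_command(
    "hdfs://active-nn:8020/data/compacted",
    "hdfs://dr-nn:8020/data/compacted",
    bandwidth_mb_per_map=200,
    max_maps=20,
)
print(" ".join(cmd))
```

As far as I know, extra maps will not speed up a single large file that is copied by one mapper, but they should help when many such files are waiting to be copied.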

But definitely check that there is enough network bandwidth.
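
To put a rough number on it, here is a back-of-the-envelope sketch using the figures from the question (about 100 GB copied in about 5 hours). Both figures are approximations taken from the post, and the function name is only for illustration.

```python
def throughput_mb_per_s(size_gb, hours):
    """Average copy rate in MB/s for a file of size_gb copied in the given number of hours."""
    return size_gb * 1024 / (hours * 3600)

observed = throughput_mb_per_s(100, 5)
print(f"{observed:.1f} MB/s")        # prints 5.7 MB/s
print(f"{observed * 8:.1f} Mbit/s")  # prints 45.5 Mbit/s
```

If the link between the two clusters can sustain much more than that, the network itself is probably not the limit, and the per-map bandwidth setting is the first thing to look at.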