
distcp --> update replication-factor and block-size as per destination cluster


When I copy a file from one cluster to another using distcp, it preserves the replication factor and block size by default. For example:

When I copy a file from Cluster A (replication factor 2, block size 64 MB) to Cluster B (default replication factor 3, default block size 128 MB), the copied file keeps Cluster A's values, i.e. 2 and 64 MB. I want the file to get the default values of Cluster B instead.

How can I make the copied file take its replication factor and block size from the destination cluster?

(Both clusters are running HDP-2.6.x.)


Re: distcp --> update replication-factor and block-size as per destination cluster

Hey @Himanshu Rawat!

I ran a test here, and I think you can pass -D dfs.replication to your distcp command to set the replication factor explicitly.

Here's the command I used:

[hdfs@node4 ~]$ hadoop distcp -D dfs.replication=1 /largefile.img webhdfs://<MyOtherHDP>:50070/
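The test above only covers replication. A minimal sketch of extending the same idea to block size follows, under the assumption (not verified in this thread) that distcp also honors a client-side `-D dfs.blocksize` override when block size is not being preserved. `dfs.replication` and `dfs.blocksize` are standard HDFS properties; `build_distcp_cmd` is a hypothetical helper that only prints the command so it can be reviewed before running, and the values 3 and 128 MB are Cluster B's defaults from the question.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) a distcp command that sets the
# destination-side replication factor and block size explicitly.
# Arguments: source path, destination URI, replication factor, block size in bytes.
# Assumption: distcp applies -D dfs.blocksize to the files it writes.
build_distcp_cmd() {
  printf 'hadoop distcp -D dfs.replication=%s -D dfs.blocksize=%s %s %s\n' \
    "$3" "$4" "$1" "$2"
}

# 128 MB expressed in bytes: 128 * 1024 * 1024 = 134217728.
build_distcp_cmd /largefile.img 'webhdfs://<MyOtherHDP>:50070/' 3 $((128 * 1024 * 1024))
```

Note that the `-p` option with the `r` or `b` flags (for example `-prb`) asks distcp to preserve the source replication and block size, so leave those flags out if the destination values should apply. After the copy, `hdfs dfs -stat "%r %o" <path>` on the destination prints the replication factor and block size actually assigned.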

Hope this helps!
