
Does the parameter of Replication impact the run speed of write data in the hdfs?

New Contributor

I found the following three parameters, which relate to replication speed, in Cloudera's documentation, Decommissioning and Recommissioning Hosts (https://www.cloudera.com/documentation/enterprise/5-5-x/topics/cm_mc_decomm_host.html):

1. Replication Work Multiplier Per Iteration
2. Maximum number of replication threads on a DataNode
3. Hard limit on the number of replication threads on a DataNode

Does changing these three parameters affect the speed of ordinary data writes to HDFS? I would be grateful for a reply. Thanks.

4 REPLIES

Re: Does the parameter of Replication impact the run speed of write data in the hdfs?

Master Guru
The parameters you mention apply only to background re-replication work,
such as during decommissioning, host or disk failures, or a change in the
replication factor of existing files. The configs exist to ensure that such
background activity does not suddenly overwhelm the cluster, and therefore
carry conservative values.
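
As a rough illustration of how these caps bound background work, here is a minimal sketch. It is a toy model, not Hadoop code. The property names and default values are assumptions recalled from hdfs-default.xml and should be checked against your version, and the function names are hypothetical. It assumes the NameNode schedules at most (live DataNodes x multiplier) blocks per replication pass, and that each DataNode is capped at the "maximum" value for ordinary work and at the "hard limit" value for highest-priority work.

```python
# Illustrative model only, not Hadoop source code.
# Property names and default values are assumptions recalled from
# hdfs-default.xml; check them against your Hadoop/CDH version.
REPLICATION_DEFAULTS = {
    # "Replication Work Multiplier Per Iteration"
    "dfs.namenode.replication.work.multiplier.per.iteration": 2,
    # "Maximum number of replication threads on a DataNode"
    "dfs.namenode.replication.max-streams": 2,
    # "Hard limit on the number of replication threads on a DataNode"
    "dfs.namenode.replication.max-streams-hard-limit": 4,
}


def blocks_scheduled_per_iteration(live_datanodes, conf=REPLICATION_DEFAULTS):
    """Blocks the NameNode may schedule for re-replication in one pass."""
    multiplier = conf["dfs.namenode.replication.work.multiplier.per.iteration"]
    return live_datanodes * multiplier


def per_datanode_stream_cap(highest_priority, conf=REPLICATION_DEFAULTS):
    """Concurrent outgoing re-replication streams allowed on one DataNode."""
    if highest_priority:
        return conf["dfs.namenode.replication.max-streams-hard-limit"]
    return conf["dfs.namenode.replication.max-streams"]


if __name__ == "__main__":
    print(blocks_scheduled_per_iteration(live_datanodes=10))
    print(per_datanode_stream_cap(highest_priority=False))
```

Nothing in this model is consulted on the client write path, which is why raising or lowering these values changes only how fast background re-replication proceeds.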

HDFS client writes of new data are typically unthrottled, and require no
adjustment of any capping configuration to perform better.

Are you finding your writes to be slower than expected in the environment?
What measurement points are suggesting this?

Re: Does the parameter of Replication impact the run speed of write data in the hdfs?

New Contributor

Hi Harsh,

Thank you for your reply. My doubt was this: since a write creates the three replicas in the HDFS filesystem (see the figure below), changing these three parameters might affect how quickly the replicas are created during the write. So, according to your reply, is replica creation during a write not affected by these three parameters?

 

[Figure: SNAG-1588.png]

Re: Does the parameter of Replication impact the run speed of write data in the hdfs?

New Contributor

Hi Harsh,

Thank you for your reply. So, according to your reply, is replica creation during a write not affected by these three parameters (1. Replication Work Multiplier Per Iteration 2. Maximum number of replication threads on a DataNode 3. Hard limit on the number of replication threads on a DataNode)?

Thanks.

Re: Does the parameter of Replication impact the run speed of write data in the hdfs?

Master Guru
Yes, that is correct. Clients write the replicas in 'real time'
synchronously and are not throttled by any of those background work configs.
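
To make the contrast concrete, here is a toy sketch of the write path. It is a simplified model, not HDFS code: the function names and DataNode labels are hypothetical, and real HDFS streams many packets in flight rather than handling one at a time. It shows only that every replica is produced inside the client's own write, through the DataNode pipeline, without any background scheduler or replication cap being involved.

```python
# Toy model of a client write through a DataNode pipeline.
# Simplification: one packet at a time; real HDFS keeps several packets
# in flight and acknowledges them asynchronously.

def write_packet(packet, pipeline):
    """Store one packet on every DataNode in order and return the acks."""
    stored = {}
    for node in pipeline:
        # Each node keeps a copy; the same packet then reaches the next node.
        stored[node] = packet
    # Acknowledgements travel back from the last node to the first.
    acks = list(reversed(pipeline))
    return stored, acks


def write_block(packets, pipeline):
    """Write all packets; every replica is complete when this returns."""
    replicas = {node: [] for node in pipeline}
    for packet in packets:
        stored, acks = write_packet(packet, pipeline)
        if set(acks) != set(pipeline):
            raise IOError("pipeline did not acknowledge the packet")
        for node, data in stored.items():
            replicas[node].append(data)
    return replicas


if __name__ == "__main__":
    result = write_block([b"part-1", b"part-2"], ["dn1", "dn2", "dn3"])
    for node, data in result.items():
        print(node, data)
```

By the time `write_block` returns, all three replicas already hold the same data, so there is nothing left for the throttled background re-replication to do for this block.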