New Contributor
Posts: 5
Registered: ‎12-14-2018

Does the parameter of Replication impact the run speed of write data in the hdfs?

I found the following three parameters, which relate to replication speed, in Cloudera's documentation, Decommissioning and Recommissioning Hosts (https://www.cloudera.com/documentation/enterprise/5-5-x/topics/cm_mc_decomm_host.html):

1. Replication Work Multiplier Per Iteration
2. Maximum number of replication threads on a DataNode
3. Hard limit on the number of replication threads on a DataNode

Does changing these three parameters affect the speed of ordinary data writes to HDFS? I would be grateful for any reply. Thanks.

Posts: 1,826
Kudos: 406
Solutions: 292
Registered: ‎07-31-2013

Re: Does the parameter of Replication impact the run speed of write data in the hdfs?

The mentioned parameters only apply to background re-replication work, such
as during decommissioning, host or disk failures, change in replication
factor of existing files, etc. The configs exist to ensure such background
activity does not suddenly overwhelm the cluster, and they therefore carry
conservative default values.
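
As a rough illustration of how the first parameter caps background work, here
is a minimal Python sketch. It is not Hadoop source code. The property names
and default values in the comments are the ones I recall from the stock Apache
hdfs-default.xml and should be checked against your release; the function name
is made up for illustration. It assumes the NameNode schedules at most
(live DataNodes x multiplier) blocks per iteration, as the property
description states.

```python
# Toy model of the NameNode's background re-replication cap (a sketch only).
# Assumed property names and stock defaults, to be verified for your release:
#   dfs.namenode.replication.work.multiplier.per.iteration = 2
#   dfs.namenode.replication.max-streams                   = 2
#   dfs.namenode.replication.max-streams-hard-limit        = 4

WORK_MULTIPLIER = 2  # "Replication Work Multiplier Per Iteration"


def blocks_scheduled_per_iteration(live_datanodes, multiplier=WORK_MULTIPLIER):
    """Upper bound on blocks picked for re-replication in one scheduling pass.

    Hypothetical helper: models only the documented
    'multiplier x number of live nodes' rule, nothing else.
    """
    return live_datanodes * multiplier


if __name__ == "__main__":
    # With 10 live DataNodes and the assumed default multiplier,
    # at most 20 blocks start transferring per iteration.
    print(blocks_scheduled_per_iteration(10))
```

Nothing in this calculation is consulted when a client writes a new file,
which is why raising these values speeds up decommissioning but not ordinary
writes.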

HDFS client writes of new data are typically unthrottled, and require no
adjustment of any capping configuration to perform better.

Are you finding your writes to be slower than expected in the environment?
What measurement points are suggesting this?
New Contributor
Posts: 5
Registered: ‎12-14-2018

Re: Does the parameter of Replication impact the run speed of write data in the hdfs?

Hi Harsh,

Thank you for your reply. My doubt was this: since a write generates three replicas in the HDFS filesystem (see the figure below), changing these three parameters might affect how fast the replicas are created during the write. So, according to your reply, is replica creation during a write not affected by these three parameters?

 

SNAG-1588.png

New Contributor
Posts: 5
Registered: ‎12-14-2018

Re: Does the parameter of Replication impact the run speed of write data in the hdfs?

Hi Harsh,

Thank you for your reply. So, according to your reply, is replica creation during a write not affected by these three parameters (1. Replication Work Multiplier Per Iteration 2. Maximum number of replication threads on a DataNode 3. Hard limit on the number of replication threads on a DataNode)?

Thanks.

Posts: 1,826
Kudos: 406
Solutions: 292
Registered: ‎07-31-2013

Re: Does the parameter of Replication impact the run speed of write data in the hdfs?

Yes, that is correct. Clients write the replicas in 'real time'
synchronously and are not throttled by any of those background work configs.
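
To make the 'real time' part concrete, here is a toy Python model of a write
pipeline. It is only a sketch under simplifying assumptions: DataNodes are
plain lists, a "packet" is a bytes object, and write_block is a hypothetical
name, not an HDFS API. It shows that each DataNode stores the data and
forwards it downstream before the write is acknowledged, with no replication
throttle setting involved anywhere.

```python
def write_block(data, pipeline):
    """Toy synchronous pipeline: store locally, forward, then ack upstream.

    `pipeline` is a list of lists standing in for DataNodes (an assumption
    made for illustration). Returns one ack per DataNode in pipeline order.
    """
    if not pipeline:
        return []
    head, rest = pipeline[0], pipeline[1:]
    head.append(data)                          # this DataNode keeps its replica
    downstream_acks = write_block(data, rest)  # forward to the next DataNode
    return [True] + downstream_acks            # acks travel back to the client


if __name__ == "__main__":
    dn1, dn2, dn3 = [], [], []
    acks = write_block(b"packet-0", [dn1, dn2, dn3])
    # The client sees the write as done only once all three replicas exist.
    print(acks, dn1 == dn2 == dn3)
```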