
How do we fix the build pipeline recovery block error?


I have a file in HDFS with 8 billion records. When we insert it into an internal table, we get the following error:

HdfsIOException: Build pipeline to recovery block [block pool ID: BP-2080382728-10.3.50.10-1444849419015 block ID 1076905963_3642418] failed: all datanodes are bad.

We tried setting the pipeline recovery parameters:

dfs.client.block.write.replace-datanode-on-failure.enable to true,

dfs.client.block.write.replace-datanode-on-failure.policy to DEFAULT, and

dfs.client.block.write.replace-datanode-on-failure.best-effort to true.

We know that setting best-effort to true can lead to data loss if all datanodes go down, but we still wanted to try it so that our insert process could run smoothly. However, this didn't work either.
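For anyone reproducing this, here is a rough sketch of the three settings written out in the usual Hadoop *-site.xml property layout. The property names and values are the ones quoted above; the helper name `render_hdfs_site` is made up for illustration, and which file and which client actually picks up these settings depends on the cluster, so treat it as an assumption rather than a confirmed setup.

```python
import xml.etree.ElementTree as ET

# Property names and values exactly as quoted in the question.
PIPELINE_RECOVERY_SETTINGS = {
    "dfs.client.block.write.replace-datanode-on-failure.enable": "true",
    "dfs.client.block.write.replace-datanode-on-failure.policy": "DEFAULT",
    "dfs.client.block.write.replace-datanode-on-failure.best-effort": "true",
}


def render_hdfs_site(settings):
    """Hypothetical helper: build a <configuration> document with one
    <property><name/><value/></property> element per setting."""
    root = ET.Element("configuration")
    for name, value in settings.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = value
    # ET.indent only exists on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(render_hdfs_site(PIPELINE_RECOVERY_SETTINGS))
```

Since these are dfs.client.* properties, they presumably need to be visible to the client that performs the write, not only to the datanodes.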

Can anyone suggest what the possible cause of this error might be and how it can be fixed?

Your help is greatly appreciated

Thanks

1 REPLY

@Sri Kalyan Kasyap Vundrakonda

How many nodes do you have in the cluster, and what's the replication factor? Is it set to the default of 3?

Try decreasing the replication factor. This link might help
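To show why the node count and replication factor matter here, below is a minimal sketch of the DEFAULT replace-datanode-on-failure rule as I understand it from the description in hdfs-default.xml: a replacement datanode is requested only when the replication factor r is at least 3 and either floor(r/2) >= n (n being the datanodes still in the pipeline), or r > n and the block was hflushed/appended. The function names are hypothetical and this is not the actual HDFS client code.

```python
def default_policy_adds_datanode(replication, existing, hflushed_or_appended=False):
    """Sketch of the DEFAULT policy rule (assumed from hdfs-default.xml).

    replication: configured replication factor r
    existing: datanodes still alive in the write pipeline, n
    """
    if replication < 3:
        # Below 3 replicas the DEFAULT policy never asks for a replacement.
        return False
    if replication // 2 >= existing:
        return True
    return replication > existing and hflushed_or_appended


def spare_datanodes(live_datanodes, existing_in_pipeline):
    """Live datanodes that are not already part of the pipeline."""
    return max(live_datanodes - existing_in_pipeline, 0)


if __name__ == "__main__":
    # 3-node cluster, replication 3, one pipeline node has failed:
    # 2 live nodes remain and both are already in the pipeline.
    print(default_policy_adds_datanode(3, 2, hflushed_or_appended=True))
    print(spare_datanodes(live_datanodes=2, existing_in_pipeline=2))
```

If the cluster has no more live datanodes than the replication factor, there is no spare node to swap in when one fails, so pipeline recovery has nothing to work with. With a replication factor below 3, this rule never asks for a replacement at all, which may be why lowering it helps on small clusters.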
