I have a file in HDFS with 8 billion records, and while inserting it into an internal table we encountered the following error:
HdfsIOException: Build pipeline to recovery block [block pool ID: BP-2080382728-10.3.50.10-1444849419015 block ID 1076905963_3642418] failed: all datanodes are bad.
We tried setting the pipeline recovery parameters as follows (the configuration snippet we used is shown below):
- dfs.client.block.write.replace-datanode-on-failure.enable = true
- dfs.client.block.write.replace-datanode-on-failure.policy = DEFAULT
- dfs.client.block.write.replace-datanode-on-failure.best-effort = true (we are aware this can lead to data loss if all datanodes in the pipeline go down, but we still wanted to try it so the insert process could run through).
However, this didn't work either.
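For reference, this is roughly how we applied those settings on the client side; the property names are the standard HDFS client parameters mentioned above, and the hdfs-site.xml placement is just how we chose to set them:

<!-- client-side hdfs-site.xml -->
<property>
  <!-- allow the client to replace a failed datanode in the write pipeline -->
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>true</value>
</property>
<property>
  <!-- DEFAULT: replace a datanode only when the pipeline is large enough / write is long-lived -->
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>DEFAULT</value>
</property>
<property>
  <!-- best-effort: keep writing even if a replacement datanode cannot be found -->
  <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
  <value>true</value>
</property>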
Can anyone suggest what the possible causes of this error are and how it can be fixed?
Your help is greatly appreciated.
Thanks