We get the following error in the Spark logs:
java.io.IOException: Failed to replace a bad datanode on the existing pipeline due to no more good datanodes being available to try. (Nodes: current=[DatanodeInfoWithStorage DatanodeInfoWithStorage\
The current failed datanode replacement policy is DEFAULT, and a client may configure this via 'dfs.client.block.write.replace-datanode-on-failure.policy' in its configuration.
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.findNewDatanode(DFSOutputStream.java:1036)
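From what I read, this means the HDFS client itself can override the replacement policy through its Hadoop configuration. Below is only a sketch of what I think we could pass when submitting the Spark job (the property names come from hdfs-default.xml; choosing NEVER or best-effort is just my guess at what might help on a cluster this small, not something I have confirmed):

# tell the HDFS client not to try replacing a failed datanode in the write pipeline
spark-submit \
  --conf spark.hadoop.dfs.client.block.write.replace-datanode-on-failure.policy=NEVER \
  ...our usual job arguments...

# or keep the DEFAULT policy but let the write continue when no replacement node is found
spark-submit \
  --conf spark.hadoop.dfs.client.block.write.replace-datanode-on-failure.best-effort=true \
  ...our usual job arguments...

(spark.hadoop.* properties are copied by Spark into the Hadoop Configuration of the job, so the same keys could also go into the client-side hdfs-site.xml.)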
My Ambari cluster includes only 3 worker machines, and each worker has only one data disk.
I searched on Google and found that the suggested solution may be:
Set the HDFS block replication to 1 instead of 3.
Is that true?
Second, since each of my worker machines has only one data disk, could that also be part of the problem?
Block replication = the number of copies of each block that HDFS keeps, as specified by the dfs.replication factor.
Setting dfs.replication=1 means there will be only one copy of each file's blocks in the file system.
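If we go that way, my understanding is that changing dfs.replication (HDFS -> Configs in Ambari) only affects files written after the change, and existing files keep their old factor until it is lowered by hand. A rough sketch of the commands I think would do it (/user/spark/data is just a placeholder for our own directory):

# lower the replication factor of everything under the directory; -w waits until the change is applied
hdfs dfs -setrep -w 1 /user/spark/data

# the replication column (second field) of the listing should show 1 afterwards
hdfs dfs -ls /user/spark/data

Of course with a single copy, any failed disk means lost blocks, so I am not sure it is a good trade-off given that each worker has only one data disk.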
Michael-Bronson