Created 05-24-2017 06:50 AM
hbase-error.txt
We have a 7-node cluster in which 5 nodes run a RegionServer. On a daily basis one of the RegionServer nodes goes down, and I get the same error from each of the affected nodes. Please find the attached log.
Created 05-24-2017 07:17 AM
Are all of your DataNodes healthy, with enough available disk space? For some reason, writing a block to one of them fails, and because your replication factor is 2 and dfs.client.block.write.replace-datanode-on-failure.policy=DEFAULT, the NameNode will not try another DataNode and the write fails. So first make sure your DataNodes are all right. If they look good, then try setting dfs.client.block.write.replace-datanode-on-failure.best-effort=true.
Created 05-24-2017 09:48 AM
All of my DataNodes are healthy and have enough space. My replication factor is 3, the default. Will setting dfs.client.block.write.replace-datanode-on-failure.best-effort=true risk data loss? Kindly suggest.
Created 05-29-2017 02:11 AM
It will not result in data loss. Can you try setting the above properties and check again?
Created 06-14-2017 04:56 AM
hmas.txt
regionserver.txt
Hi nshelke,
I have set these properties as you suggested, but the HBase Master and RegionServer still go down. There is also a backup process running behind this, and it fails too.
dfs.client.block.write.replace-datanode-on-failure.policy=ALWAYS
dfs.client.block.write.replace-datanode-on-failure.best-effort=true
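For reference, a sketch of how these two client-side settings are usually placed in hdfs-site.xml (the property names are from the HDFS configuration docs; where exactly the file lives, and whether you edit it directly or through your cluster manager, depends on your distribution):

```xml
<!-- hdfs-site.xml: client-side handling of DataNode failure in the write pipeline -->
<property>
  <!-- ALWAYS: try to replace a failed DataNode regardless of replication factor -->
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>ALWAYS</value>
</property>
<property>
  <!-- Best effort: continue the write even if no replacement DataNode is found -->
  <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
  <value>true</value>
</property>
```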
Please find the attached logs of the HBase Master and RegionServer.
Kindly suggest.
Created 04-09-2020 01:26 PM
From your logs I see there are no healthy DataNodes available to replace the bad ones. In addition, I see several slow-sync errors, for which you will have to tune your memstore's lower and upper limits to reduce the frequency of flushes and get the best out of the available heap.
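As a sketch of the memstore tuning mentioned above, these are the global memstore properties in hbase-site.xml (HBase 1.x+ names; older releases use hbase.regionserver.global.memstore.upperLimit/lowerLimit instead, and the values below are the defaults, shown only as a starting point for tuning):

```xml
<!-- hbase-site.xml: global memstore limits per RegionServer (example values) -->
<property>
  <!-- Upper bound: fraction of RegionServer heap all memstores together may use -->
  <name>hbase.regionserver.global.memstore.size</name>
  <value>0.4</value>
</property>
<property>
  <!-- Lower bound: fraction of the upper limit at which forced flushing stops -->
  <name>hbase.regionserver.global.memstore.size.lower.limit</name>
  <value>0.95</value>
</property>
```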