Hbase Region server down frequently

Contributor

We have a 7-node cluster in which 5 nodes run a region server. Every day, one of the region servers goes down, and I get the same error on each affected node. Please find the attached log (hbase-error.txt).

Re: Hbase Region server down frequently

@Mathi Murugan

Are all of your DataNodes healthy, and do they have enough available disk space? For some reason, writing a block to one of them fails, and because your replication factor is 2 and replace-datanode-on-failure.policy=DEFAULT, no replacement DN is tried and the write fails. So, first make sure your DNs are all right. If they look good, then try setting:

  1. dfs.client.block.write.replace-datanode-on-failure.policy=ALWAYS
  2. dfs.client.block.write.replace-datanode-on-failure.best-effort=true
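
For reference, the two settings above could be written as the following fragment. This is only a sketch: it assumes the properties go into the hdfs-site.xml that the HBase region servers read as HDFS clients, and that the affected services are restarted afterwards. Note that the HDFS documentation describes best-effort=true as letting a write continue on a pipeline with fewer DataNodes, which raises the probability of data loss, so it is worth weighing before enabling.

```xml
<!-- hdfs-site.xml on the HDFS client side (assumed location; adjust for your deployment) -->
<property>
  <!-- Always try to add a replacement DataNode when one in the write pipeline fails -->
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>ALWAYS</value>
</property>
<property>
  <!-- Keep writing even if no replacement DataNode can be found -->
  <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
  <value>true</value>
</property>
```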
Re: Hbase Region server down frequently

Contributor

All of my DataNodes are healthy and have enough space. My replication factor is 3, the default. Will setting dfs.client.block.write.replace-datanode-on-failure.best-effort=true result in data loss? Kindly suggest.

Re: Hbase Region server down frequently

@Mathi Murugan

It will not result in data loss. Can you try setting the properties above and check again?

Re: Hbase Region server down frequently

Contributor

Hi nshelke,

I set these properties as you suggested, but the HBase Master and the region servers are still going down. A backup process running in the background fails as well.

dfs.client.block.write.replace-datanode-on-failure.policy=ALWAYS

dfs.client.block.write.replace-datanode-on-failure.best-effort=true

Please find the attached logs of the HBase Master and the region server (hmas.txt, regionserver.txt).

Kindly suggest.

Re: Hbase Region server down frequently

New Contributor

From your logs, I see there are no healthy DataNodes available to replace the bad ones. In addition, I see several slow sync errors, for which you will have to tune the memstore's lower and upper limit configuration to reduce how often data is flushed and get the best out of the available heap.
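
As a rough illustration of the memstore limits mentioned above, the fragment below shows the two global memstore properties as they are named in recent HBase releases. This is a sketch under assumptions: the thread does not state the HBase version (older releases use hbase.regionserver.global.memstore.upperLimit and hbase.regionserver.global.memstore.lowerLimit instead), and the values shown are the usual defaults used as placeholders, not a recommendation for this cluster.

```xml
<!-- hbase-site.xml (property names assume a recent HBase release; values are placeholders) -->
<property>
  <!-- Upper limit: fraction of region server heap that all memstores may use before updates are blocked -->
  <name>hbase.regionserver.global.memstore.size</name>
  <value>0.4</value>
</property>
<property>
  <!-- Lower limit: fraction of the upper limit at which forced flushing starts -->
  <name>hbase.regionserver.global.memstore.size.lower.limit</name>
  <value>0.95</value>
</property>
```

Any change to these should be checked against the block cache size and the total heap, since the memstore and block cache fractions share the same heap.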
