Support Questions

HBase Flush?



Champion Alumni



I have a coordinator that runs every day and writes to HBase tables.

Yesterday the job failed with:


12477375 [hconnection-0x17444a28-shared--pool954-t772] INFO org.apache.hadoop.hbase.client.AsyncProcess
- #3792, table=table_name, attempt=31/35 failed 1 ops,
last exception: org.apache.hadoop.hbase.NotServingRegionException:
Region table_name,dsqdqs|122A48C3-,1439883135077.f07d81b4d4ff8e9d4170cce187fc2027.
is not online on <IP>,60020,1447053312111
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(
    at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(
    at org.apache.hadoop.hbase.regionserver.HRegionServer.multi(
    at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(
    at
    at
    at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(
    at org.apache.hadoop.hbase.ipc.RpcExecutor$
    at


I did the following checks:

- hbase hbck ==> no errors

- hbase fsck / ==> no errors

- major_compact 'table_name' ==> after this, I managed to run the job


However, even though the workflow finished successfully, no data was written to the HBase tables.


I tried:

- flush 'table_name' ==> didn't change anything.


Do you have any suggestions on why the data is not written?

(I tried the flush command because I suspected the files were not being written.)
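
For reference (and assuming "hbase fsck /" above referred to the HDFS fsck), the checks were run roughly like this from the command line and the HBase shell:

```shell
# Check HBase region consistency (reported no errors)
hbase hbck

# Check HDFS block health (reported no errors)
hdfs fsck /

# Force a major compaction, then a flush, via the HBase shell
echo "major_compact 'table_name'" | hbase shell
echo "flush 'table_name'" | hbase shell
```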





Re: HBase Flush?

Master Guru
Are you able to run a org.apache.hadoop.hbase.mapreduce.RowCounter job on this table successfully?

A few reasons you may see NotServingRegionException are ongoing splits or balancer-invoked moves of regions. However, these should ideally last under a minute in good cases, so exhausting all 35 retries seems excessive.

I'd recommend searching for your region ID (f07d81b4d4ff8e9d4170cce187fc2027) in the HMaster log, and then in the log of the RegionServer that hosted it (per the HMaster Web UI), to see what went wrong (or was going on) during that timeframe.
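
A sketch of both steps (the log path below is a typical default, not guaranteed; adjust it to your installation):

```shell
# Run the MapReduce row counter; a clean run means every region responded.
hbase org.apache.hadoop.hbase.mapreduce.RowCounter table_name

# Trace the region's history in the HMaster log, then repeat the same
# search in the hosting RegionServer's log.
grep f07d81b4d4ff8e9d4170cce187fc2027 /var/log/hbase/*master*.log
```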

Re: HBase Flush?

Champion Alumni



The row counter was not completing successfully. I managed to make it work only after I manually ran a major_compact on this table and balanced the regions across the Region Servers. However, balancing the tables took about 20-30 minutes.


In Cloudera Manager I had major compaction configured to run every 2 days.


How can I be sure that I won't have this situation again?


Thank you! 


Re: HBase Flush?

Champion Alumni

I searched for the root cause and found this:

DatanodeRegistration(<ip>, datanodeUuid=5a1b56f4-34ac-48da-bfd1-5b8107c26705,
infoPort=50075, ipcPort=50020,
storageInfo=lv=-56;cid=cluster18;nsid=1840079577;c=0): Got exception while serving
BP-1623273649-ip-1419337015794:blk_1076874382_3137087 to /<ip>:49919
480000 millis timeout while waiting for channel to be ready for write.
ch : java.nio.channels.SocketChannel[connected local=/<ip>:50010 remote=/<ip>:49919]
    at
    at
    at
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendPacket(
    at org.apache.hadoop.hdfs.server.datanode.BlockSender.sendBlock(
    at org.apache.hadoop.hdfs.server.datanode.DataXceiver.readBlock(
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opReadBlock(
    at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(
    at
    at

In fact, the RegionServer went down for 4 minutes, and then the job failed with the

NotServingRegionException (the exception that I posted in my first post).

Is the real solution to increase dfs.datanode.socket.write.timeout, as suggested in  and ?
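
If raising the timeout is the route taken, it would normally be set in hdfs-site.xml on the DataNodes (or via the equivalent safety valve in Cloudera Manager); the value below is purely an illustrative example, not a recommendation:

```xml
<!-- hdfs-site.xml: DataNode socket write timeout, in milliseconds.
     The default is 480000 (8 minutes); the doubled value here is
     only an example. -->
<property>
  <name>dfs.datanode.socket.write.timeout</name>
  <value>960000</value>
</property>
```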


In fact, what is even stranger is that the error shows

SocketTimeoutException: 480000 millis

while in my actual configuration, under HDFS -> Service Monitor Client Config Overrides, I have:



Also, in the JIRA , Uma Maheswara Rao G said that "In our observation this issue came in long run with huge no of blocks in Data Node".


In my case we have between 56334 and 80512 blocks per DataNode. Is this considered huge?



Thank you!
