
HDP 3.1.4: Inconsistent HBase region state results in region server downtime

Explorer

2023-02-23 09:59:54,031 ERROR [RpcServer.default.FPBQ.Fifo.handler=189,queue=9,port=16000] master.MasterRpcServices: Region server hadoop-08,16020,1676022229147 reported a fatal error:
***** ABORTING region server hadoop-08,16020,1676022229147: org.apache.hadoop.hbase.YouAreDeadException: rit=OPEN, location=hadoop-09,16020,1676022233663, table=hh_app_hbase_poc_tag:label_d_common_data_20221229, region=2d10db72bf694a8af42a38f62ae13c7b reported OPEN on server=hadoop-08,16020,1676022229147 but state has otherwise.
at org.apache.hadoop.hbase.master.assignment.AssignmentManager.checkOnlineRegionsReport(AssignmentManager.java:1036)
at org.apache.hadoop.hbase.master.assignment.AssignmentManager.reportOnlineRegions(AssignmentManager.java:960)
at org.apache.hadoop.hbase.master.MasterRpcServices.regionServerReport(MasterRpcServices.java:466)
at org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:13118)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:131)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
Caused by: org.apache.hadoop.hbase.exceptions.UnexpectedStateException: rit=OPEN, location=hadoop-09,16020,1676022233663, table=hh_app_hbase_poc_tag:label_d_common_data_20221229, region=2d10db72bf694a8af42a38f62ae13c7b reported OPEN on server=hadoop-08,16020,1676022229147 but state has otherwise.

 

These are my HMaster error logs. Has anyone encountered this problem, and how did you solve it?

Here is another post I found; its error is the same as mine:

https://community.cloudera.com/t5/Support-Questions/After-Upgrading-to-HDP-3-0-1-Hbase-balancer-stuc...

 


Contributor

Hello,


org.apache.hadoop.hbase.YouAreDeadException normally occurs when a region server loses communication with ZooKeeper, or takes too long reporting its availability through its znode, which can happen for several different reasons [1].

 

You may want to check what is in the region server logs, verify that the ZooKeeper service is not crashing, and confirm that the ZK timeouts are properly set [2].
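As a minimal sketch of where the ZK session timeout lives, assuming typical HDP defaults (the value below is an example only, not a recommendation):

```xml
<!-- hbase-site.xml (example value only; tune to your cluster) -->
<property>
  <name>zookeeper.session.timeout</name>
  <!-- How long before the master declares a region server dead.
       ZooKeeper caps the effective value at roughly 20x its tickTime,
       so raising this may also require raising tickTime in zoo.cfg. -->
  <value>90000</value>
</property>
```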

 

Hope this helps.

 

[1] https://issues.apache.org/jira/browse/HBASE-25274 

[2] https://community.cloudera.com/t5/Customer/What-is-the-formula-to-calculate-ZooKeeper-timeouts-for/t... 
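The inconsistency itself (a region reported OPEN on one server while the master's state says otherwise) can also be inspected and, if needed, repaired with the HBCK2 tool on HBase 2.x. A hedged sketch, assuming the hbase-hbck2 jar is available on the master host (the jar path is a placeholder; the encoded region name is taken from the logs above):

```shell
# Check the region's state as recorded in hbase:meta
echo "scan 'hbase:meta', {ROWPREFIXFILTER => 'hh_app_hbase_poc_tag:label_d_common_data_20221229'}" \
  | hbase shell -n

# With HBCK2: correct the master's view of the region state, then re-assign it
hbase hbck -j /path/to/hbase-hbck2.jar setRegionState 2d10db72bf694a8af42a38f62ae13c7b CLOSED
hbase hbck -j /path/to/hbase-hbck2.jar assigns 2d10db72bf694a8af42a38f62ae13c7b
```

Use setRegionState with care: it edits hbase:meta directly, so it should only be run when the master's view is genuinely out of sync.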

Explorer

Hello,

There is no ZK-related exception in my region server log; only the "stopping server" message appears after the regions are closed.

The following is the log of my region server:

2023-02-23 09:59:54,281 INFO [RS_CLOSE_REGION-regionserver/hadoop-08:16020-1] regionserver.HRegion: Closed hh_app_hbase_poc_tag:label_d_common_data_20221229,018_1818111,1674980891241.2d10db72bf694a8af42a38f62ae13c7b.
2023-02-23 09:59:54,282 INFO [RS_CLOSE_REGION-regionserver/hadoop-08:16020-0] hbase.RangerAuthorizationCoprocessor: Unable to get remote Address
2023-02-23 09:59:54,457 INFO [regionserver/hadoop-08:16020] regionserver.HRegionServer: stopping server hadoop-08,16020,1676022229147; all regions closed.
2023-02-23 09:59:54,478 WARN [Close-WAL-Writer-304] asyncfs.FanOutOneBlockAsyncDFSOutputHelper: lease for file /apps/hbase/data/WALs/hadoop-08,16020,1676022229147/hadoop-08%2C16020%2C1676022229147.1677116687197 is expired, give up

org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.hdfs.server.namenode.LeaseExpiredException): Client (=DFSClient_NONMAPREDUCE_1647216046_1) is not the lease owner (=DFSClient_NONMAPREDUCE_1918766927_1: /apps/hbase/data/WALs/hadoop-08,16020,1676022229147-splitting/hadoop-08%2C16020%2C1676022229147.1677116687197 (inode 175124591) [Lease. Holder: DFSClient_NONMAPREDUCE_1647216046_1, pending creates: 1].