Created 01-22-2023 08:14 PM
Hi Team,
The production master node is not coming up. I am getting the error below. Could you please tell me how to resolve the issue? The data is very important.
2023-01-23 09:40:50,748 ERROR [master/ctrlsu-hbaseRS1:16000:becomeActiveMaster] master.HMaster: Failed to become active master
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:379)
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:319)
at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1324)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1055)
at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2184)
at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:519)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned and enabled: tableName=hbase:namespace, state=ENABLED
at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)
at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63)
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:249)
at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1322)
... 4 more
2023-01-23 09:40:50,749 ERROR [master/ctrlsu-hbaseRS1:16000:becomeActiveMaster] master.HMaster: Master server abort: loaded coprocessors are: []
2023-01-23 09:40:50,749 ERROR [master/ctrlsu-hbaseRS1:16000:becomeActiveMaster] master.HMaster: ***** ABORTING master ctrlsu-hbasers1,16000,1674446742141: Unhandled exception. Starting shutdown. *****
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:379)
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:319)
at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1324)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1055)
at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2184)
at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:519)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned and enabled: tableName=hbase:namespace, state=ENABLED
at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)
at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63)
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:249)
at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1322)
Created 04-03-2023 09:44 PM
Hello @RammiSE
Your post is being answered a bit late, but I am posting a response anyway. If your team has already resolved the issue, we would appreciate your sharing the details in this post for the wider audience.
For the HMaster to initialise, the "hbase:meta" and "hbase:namespace" table regions need to be online. In your previous thread, the HMaster reports that "hbase:meta" isn't online [1]. So use the HBCK2 JAR to assign the "hbase:meta" region "1588230740" first, and review (via the HBase UI) whether regions are being assigned successfully. It's possible that the "hbase:namespace" table region reports a similar trace, in which case your team needs to use the HBCK2 JAR to assign the "hbase:namespace" region as well. Restarting the HMaster after a manual HBCK2 assign isn't always required, but it does no harm either.
Regards, Smarak
[1]
2023-01-23 16:05:34,990 WARN [master/ctrlsu-hbaseMS:16000:becomeActiveMaster] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1674468867063, server=hadoop-datanode2,16020,1674362337687}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined
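The order described above (hbase:meta first, then hbase:namespace) could be sketched roughly as follows. This is only a sketch under assumptions: the jar path and the `-o` (override) option are copied from the commands posted elsewhere in this thread, the second region ID is the one the poster passed to `assigns`, and the `DRY_RUN` switch and helper names are hypothetical additions for illustration.

```shell
# Sketch: assign hbase:meta first, then the hbase:namespace region.
# Jar path and region IDs are taken from this thread; adjust for your cluster.
HBCK2_JAR="${HBCK2_JAR:-/tmp/hbase-hbck2-1.2.0.jar}"
META_REGION="1588230740"
NAMESPACE_REGION="f0b4865fe8ea07321ed8eb237a592c10"

# Build the HBCK2 "assigns" command line for one encoded region name.
assign_cmd() {
  printf 'hbase hbck -j %s assigns -o %s\n' "$HBCK2_JAR" "$1"
}

# DRY_RUN=1 (the default, a hypothetical safety switch) only prints the
# command; set DRY_RUN=0 to actually run it against the cluster.
run_assign() {
  cmd="$(assign_cmd "$1")"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$cmd"
  else
    $cmd
  fi
}

run_assign "$META_REGION"
run_assign "$NAMESPACE_REGION"
```

Printing the commands first makes it easier to double-check the encoded region names before anything is sent to the master. After the hbase:meta assign, it is probably worth confirming in the HBase UI that the region is online before assigning hbase:namespace.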
Created 01-22-2023 11:54 PM
Hi @RammiSE, based on the exception, the hbase:namespace table is not online. You will need to assign the namespace region to bring up the HBase Master.
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_hbase_hbck.html
Caused by: java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned and enabled: tableName=hbase:namespace, state=ENABLED
Created on 01-23-2023 12:08 AM - edited 01-23-2023 03:00 AM
@rki_ I am getting this error after running assigns:
2023-01-23 16:04:18,310 INFO [hconnection-0x6be3e1e2-shared-pool-1] client.RpcRetryingCallerImpl: Server.getRegionByEncodedName(HRegionServer.java:3462)
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3439)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1488)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:3182)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3558)
at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45819)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:392)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:359)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:339)
, details=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop-datanode2,16020,1674362337687, seqNum=-1, see https://s.apache.org/timeout
Created 01-23-2023 12:18 AM
@RammiSE You will need to find the encoded region ID of the namespace region in the HBase Master log, and then assign that region using the HBCK2 jar.
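As a rough illustration of reading a region ID out of the master log, the sketch below parses the "is NOT online" WARN line quoted in this thread. That line is about hbase:meta; whether the hbase:namespace region shows up in the same `state={...}` shape in your log is an assumption, and the variable names are made up for the example.

```shell
# Sample line copied from the HMaster WARN message quoted in this thread.
line='2023-01-23 16:05:34,990 WARN [master/ctrlsu-hbaseMS:16000:becomeActiveMaster] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1674468867063, server=hadoop-datanode2,16020,1674362337687}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined'

# Pull the encoded region name, the recorded state, and the recorded
# hosting server out of the "state={...}" section of the line.
encoded="$(printf '%s\n' "$line" | sed -n 's/.*state={\([0-9a-f]*\) state=.*/\1/p')"
state="$(printf '%s\n' "$line" | sed -n 's/.*state={[0-9a-f]* state=\([A-Z_]*\),.*/\1/p')"
server="$(printf '%s\n' "$line" | sed -n 's/.*server=\([^}]*\)}.*/\1/p')"

echo "region=$encoded state=$state server=$server"
```

The encoded name extracted this way is the value that `assigns` expects. Note that in this thread the server recorded for hbase:meta ends in start code 1674362337687, while the NotServingRegionException names the same host with start code 1674468863385, so the recorded location may be stale.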
Created 01-23-2023 02:53 AM
@rki_ Getting this error after executing the command "hbase hbck -j .jar assigns f0b4865fe8ea07321ed8eb237a592c10"
2023-01-23 16:04:38,448 INFO [hconnection-0x6be3e1e2-shared-pool-1] client.RpcRetryingCallerImpl: Server.getRegionByEncodedName(HRegionServer.java:3462)
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3439)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1488)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:3182)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3558)
at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45819)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:392)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:359)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:339)
, details=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop-datanode2,16020,1674362337687, seqNum=-1, see https://s.apache.org/timeout
2023-01-23 16:05:34,990 WARN [master/ctrlsu-hbaseMS:16000:becomeActiveMaster] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1674468867063, server=hadoop-datanode2,16020,1674362337687}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.
Created 01-23-2023 05:55 AM
@RammiSE Try the following:
./hbase hbck -j /tmp/hbase-hbck2-1.2.0.jar assigns -o f0b4865fe8ea07321ed8eb237a592c10
Created 01-23-2023 02:36 AM
@rki_ I am executing this command
"hbase hbck -j /tmp/hbase-hbck2-1.2.0.jar assigns f0b4865fe8ea07321ed8eb237a592c"
and getting an error. Please guide me on the next steps.
Exception in thread "main" java.io.IOException: org.apache.hbase.thirdparty.com.google.protobuf.ServiceException: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.UnknownRegionException): org.apache.hadoop.hbase.UnknownRegionException: Error trying to load region f0b4865fe8ea07321ed8eb237a592c10 from META
at org.apache.hadoop.hbase.master.assignment.AssignmentManager.loadRegionFromMeta(AssignmentManager.java:1646)
at org.apache.hadoop.hbase.master.MasterRpcServices.getRegionInfo(MasterRpcServices.java:2581)
at org.apache.hadoop.hbase.master.MasterRpcServices.assigns(MasterRpcServices.java:2615)
at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$HbckService$2.callBlockingMethod(MasterProtos.java)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:392)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:359)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:339)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=46, exceptions:
2023-01-23T10:34:38.453Z, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68451: org.apache.hadoop.hbase.NotServingRegionException: hbase:meta,,1 is not online on hadoop-datanode2,16020,1674468863385
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3462)
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3439)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1488)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:3182)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3558)
at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45819)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:392)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:359)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:339)
row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop-datanode2,16020,1674362337687, seqNum=-1
Created 01-23-2023 02:51 AM
@rki_ Getting this error after executing the command "hbase hbck -j jar assigns f0b4865fe8ea07321ed8eb237a592c"
2023-01-23 16:04:38,448 INFO [hconnection-0x6be3e1e2-shared-pool-1] client.RpcRetryingCallerImpl: Server.getRegionByEncodedName(HRegionServer.java:3462)
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3439)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1488)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:3182)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3558)
at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45819)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:392)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:359)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:339)
, details=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop-datanode2,16020,1674362337687, seqNum=-1, see https://s.apache.org/timeout
2023-01-23 16:05:34,990 WARN [master/ctrlsu-hbaseMS:16000:becomeActiveMaster] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1674468867063, server=hadoop-datanode2,16020,1674362337687}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.