Production master not coming up
Labels: Apache HBase
Created ‎01-22-2023 08:14 PM
Hi Team,
Our production HMaster is not coming up and reports the error below. Could you please tell me how to resolve it? The data is very important.
2023-01-23 09:40:50,748 ERROR [master/ctrlsu-hbaseRS1:16000:becomeActiveMaster] master.HMaster: Failed to become active master
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:379)
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:319)
at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1324)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1055)
at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2184)
at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:519)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned and enabled: tableName=hbase:namespace, state=ENABLED
at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)
at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63)
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:249)
at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1322)
... 4 more
2023-01-23 09:40:50,749 ERROR [master/ctrlsu-hbaseRS1:16000:becomeActiveMaster] master.HMaster: Master server abort: loaded coprocessors are: []
2023-01-23 09:40:50,749 ERROR [master/ctrlsu-hbaseRS1:16000:becomeActiveMaster] master.HMaster: ***** ABORTING master ctrlsu-hbasers1,16000,1674446742141: Unhandled exception. Starting shutdown. *****
java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:379)
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:319)
at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1324)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1055)
at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2184)
at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:519)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned and enabled: tableName=hbase:namespace, state=ENABLED
at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)
at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63)
at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:249)
at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1322)
Created ‎04-03-2023 09:44 PM
Hello @RammiSE
I am replying a bit late, but posting a response anyway. If your team has already resolved the issue, please share the details in this post for the wider audience.
For the HMaster to initialise, the "hbase:meta" and "hbase:namespace" table regions both need to be online. In your earlier post, the HMaster reports that "hbase:meta" isn't online [1]. So use the HBCK2 JAR to assign the "hbase:meta" region "1588230740" first, then review (via the HBase UI) whether regions are being assigned successfully. It's possible that the "hbase:namespace" table region reports a similar message, in which case your team needs to use the HBCK2 JAR to assign the "hbase:namespace" region as well. Restarting the HMaster after manually running the HBCK2 assign isn't always required, but it does no harm.
Regards, Smarak
[1]
2023-01-23 16:05:34,990 WARN [master/ctrlsu-hbaseMS:16000:becomeActiveMaster] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1674468867063, server=hadoop-datanode2,16020,1674362337687}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined
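The order described above (hbase:meta first, then hbase:namespace) can be sketched as a small shell script. This is only a sketch under assumptions, not a verified procedure: the JAR path, the `assigns -o` form and the region ID f0b4865fe8ea07321ed8eb237a592c10 are copied from other posts in this thread, treating that ID as the hbase:namespace region is an assumption, and `DRY_RUN` and `assign_region` are hypothetical names introduced here. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: assign hbase:meta first, then hbase:namespace, using HBCK2.
# The JAR path and namespace region ID are taken from this thread; adjust both.
HBCK2_JAR="${HBCK2_JAR:-/tmp/hbase-hbck2-1.2.0.jar}"
META_REGION="1588230740"   # encoded name of hbase:meta, as shown in the master log
NAMESPACE_REGION="${NAMESPACE_REGION:-f0b4865fe8ea07321ed8eb237a592c10}"  # assumed namespace region
DRY_RUN="${DRY_RUN:-1}"    # hypothetical switch: 1 = print only, 0 = also execute

# assign_region (hypothetical helper): prints the HBCK2 command for one
# encoded region name and runs it only when DRY_RUN=0.
assign_region() {
  echo "hbase hbck -j $HBCK2_JAR assigns -o $1"
  if [ "$DRY_RUN" = "0" ]; then
    hbase hbck -j "$HBCK2_JAR" assigns -o "$1"
  fi
}

# hbase:meta must be online before any other region can be looked up,
# so the namespace assign is attempted only if the meta assign succeeds.
assign_region "$META_REGION" && assign_region "$NAMESPACE_REGION"
```

In practice it is probably safer to run the two steps separately and confirm in the HBase UI that hbase:meta is online before assigning the namespace region.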
Created ‎01-22-2023 11:54 PM
Hi @RammiSE, based on the exception, the hbase:namespace table is not online. You will need to assign the namespace region to bring up the HBase Master.
https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_hbase_hbck.html
~~~
Caused by: java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned and enabled: tableName=hbase:namespace, state=ENABLED
Created on ‎01-23-2023 12:08 AM - edited ‎01-23-2023 03:00 AM
@rki_ I am getting this error after running assigns:
2023-01-23 16:04:18,310 INFO [hconnection-0x6be3e1e2-shared-pool-1] client.RpcRetryingCallerImpl: Server.getRegionByEncodedName(HRegionServer.java:3462)
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3439)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1488)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:3182)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3558)
at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45819)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:392)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:359)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:339)
, details=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop-datanode2,16020,1674362337687, seqNum=-1, see https://s.apache.org/timeout
Created ‎01-23-2023 12:18 AM
@RammiSE You will need to find the namespace region ID in the HBase Master log and then assign that region using the HBCK2 JAR.
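As a rough illustration of pulling a region ID out of the master log, the sketch below extracts the encoded region name from the "is NOT online; state={...}" message. The sample line is the hbase:meta warning quoted elsewhere in this thread; that the hbase:namespace region is reported in the same format is an assumption, and `extract_encoded_names` is a hypothetical helper name.

```shell
#!/bin/sh
# extract_encoded_names (hypothetical helper): reads log lines on stdin and
# prints each distinct encoded region name found in "state={<name> state=...".
extract_encoded_names() {
  sed -n 's/.*state={\([0-9a-f]*\) state=.*/\1/p' | sort -u
}

# Sample line copied from the master log quoted in this thread.
sample='2023-01-23 16:05:34,990 WARN [master/ctrlsu-hbaseMS:16000:becomeActiveMaster] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1674468867063, server=hadoop-datanode2,16020,1674362337687}; ServerCrashProcedures=true.'

printf '%s\n' "$sample" | extract_encoded_names
```

On a real cluster the same function could be fed from `grep 'is NOT online'` run against your HMaster log file, whose location depends on the installation.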
Created ‎01-23-2023 02:53 AM
@rki_ I am getting this error after executing the command "hbase hbck -j .jar assigns f0b4865fe8ea07321ed8eb237a592c10":
2023-01-23 16:04:38,448 INFO [hconnection-0x6be3e1e2-shared-pool-1] client.RpcRetryingCallerImpl: Server.getRegionByEncodedName(HRegionServer.java:3462)
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3439)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1488)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:3182)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3558)
at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45819)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:392)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:359)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:339)
, details=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop-datanode2,16020,1674362337687, seqNum=-1, see https://s.apache.org/timeout
2023-01-23 16:05:34,990 WARN [master/ctrlsu-hbaseMS:16000:becomeActiveMaster] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1674468867063, server=hadoop-datanode2,16020,1674362337687}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.
Created ‎01-23-2023 05:55 AM
@RammiSE Try the command below:
./hbase hbck -j /tmp/hbase-hbck2-1.2.0.jar assigns -o f0b4865fe8ea07321ed8eb237a592c10
Created ‎01-23-2023 02:36 AM
@rki_ I am executing this command:
"hbase hbck -j /tmp/hbase-hbck2-1.2.0.jar assigns f0b4865fe8ea07321ed8eb237a592c"
and getting the error below. Please guide me on the next steps.
Exception in thread "main" java.io.IOException: org.apache.hbase.thirdparty.com.google.protobuf.ServiceException: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.UnknownRegionException): org.apache.hadoop.hbase.UnknownRegionException: Error trying to load region f0b4865fe8ea07321ed8eb237a592c10 from META
at org.apache.hadoop.hbase.master.assignment.AssignmentManager.loadRegionFromMeta(AssignmentManager.java:1646)
at org.apache.hadoop.hbase.master.MasterRpcServices.getRegionInfo(MasterRpcServices.java:2581)
at org.apache.hadoop.hbase.master.MasterRpcServices.assigns(MasterRpcServices.java:2615)
at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$HbckService$2.callBlockingMethod(MasterProtos.java)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:392)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:359)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:339)
Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=46, exceptions:
2023-01-23T10:34:38.453Z, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68451: org.apache.hadoop.hbase.NotServingRegionException: hbase:meta,,1 is not online on hadoop-datanode2,16020,1674468863385
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3462)
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3439)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1488)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:3182)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3558)
at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45819)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:392)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:359)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:339)
row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop-datanode2,16020,1674362337687, seqNum=-1
Created ‎01-23-2023 02:51 AM
@rki_ I am getting this error after executing the command "hbase hbck -j jar assigns f0b4865fe8ea07321ed8eb237a592c":
2023-01-23 16:04:38,448 INFO [hconnection-0x6be3e1e2-shared-pool-1] client.RpcRetryingCallerImpl: Server.getRegionByEncodedName(HRegionServer.java:3462)
at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3439)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1488)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:3182)
at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3558)
at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45819)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:392)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:359)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:339)
, details=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop-datanode2,16020,1674362337687, seqNum=-1, see https://s.apache.org/timeout
2023-01-23 16:05:34,990 WARN [master/ctrlsu-hbaseMS:16000:becomeActiveMaster] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1674468867063, server=hadoop-datanode2,16020,1674362337687}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.
