Production master not coming up

New Contributor

Hi Team, 

 

The production master node is not coming up. I am getting the error below. Could you please tell me how to resolve it? The data is very important.

 

2023-01-23 09:40:50,748 ERROR [master/ctrlsu-hbaseRS1:16000:becomeActiveMaster] master.HMaster: Failed to become active master

java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED

at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:379)

at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:319)

at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1324)

at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1055)

at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2184)

at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:519)

at java.base/java.lang.Thread.run(Thread.java:829)

Caused by: java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned and enabled: tableName=hbase:namespace, state=ENABLED

at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)

at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63)

at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:249)

at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1322)

... 4 more

2023-01-23 09:40:50,749 ERROR [master/ctrlsu-hbaseRS1:16000:becomeActiveMaster] master.HMaster: Master server abort: loaded coprocessors are: []

2023-01-23 09:40:50,749 ERROR [master/ctrlsu-hbaseRS1:16000:becomeActiveMaster] master.HMaster: ***** ABORTING master ctrlsu-hbasers1,16000,1674446742141: Unhandled exception. Starting shutdown. *****

java.lang.IllegalStateException: Expected the service ClusterSchemaServiceImpl [FAILED] to be RUNNING, but the service has FAILED

at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.checkCurrentState(AbstractService.java:379)

at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.awaitRunning(AbstractService.java:319)

at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1324)

at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:1055)

at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2184)

at org.apache.hadoop.hbase.master.HMaster.lambda$run$0(HMaster.java:519)

at java.base/java.lang.Thread.run(Thread.java:829)

Caused by: java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned and enabled: tableName=hbase:namespace, state=ENABLED

at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:107)

at org.apache.hadoop.hbase.master.ClusterSchemaServiceImpl.doStart(ClusterSchemaServiceImpl.java:63)

at org.apache.hbase.thirdparty.com.google.common.util.concurrent.AbstractService.startAsync(AbstractService.java:249)

at org.apache.hadoop.hbase.master.HMaster.initClusterSchemaService(HMaster.java:1322)


Expert Contributor

Hi @RammiSE, based on the exception, the hbase:namespace table is not online. You will need to assign the namespace region to bring up the HBase Master.

 

https://docs.cloudera.com/documentation/enterprise/6/6.3/topics/admin_hbase_hbck.html

 

~~~

Caused by: java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned and enabled: tableName=hbase:namespace, state=ENABLED

New Contributor

@rki_ I am getting this error after running assigns:

 

2023-01-23 16:04:18,310 INFO  [hconnection-0x6be3e1e2-shared-pool-1] client.RpcRetryingCallerImpl: Server.getRegionByEncodedName(HRegionServer.java:3462)

at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3439)

at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1488)

at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:3182)

at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3558)

at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45819)

at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:392)

at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)

at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:359)

at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:339)

, details=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop-datanode2,16020,1674362337687, seqNum=-1, see https://s.apache.org/timeout

Expert Contributor

@RammiSE You will need to find the encoded region ID of the namespace region in the HBase Master log, then assign that region using the hbck2 jar.
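
One way to look up that region ID, as a rough sketch only: encoded region names are 32-character hex strings, so pulling those out of the Master log lines that mention hbase:namespace usually surfaces it. The function name and the log path in the usage comment are hypothetical, and the exact log line format varies by HBase version, so check the result against the log before assigning anything.

```shell
#!/usr/bin/env bash
# Sketch: list candidate encoded region names for hbase:namespace
# found in a Master log. Assumes the ID appears on the same line as
# the table name, as a standalone 32-character hex token.
find_namespace_region_ids() {
  # $1 = path to the HBase Master log (hypothetical path in usage below)
  grep -F 'hbase:namespace' "$1" | grep -owE '[0-9a-f]{32}' | sort -u
}

# Usage (path is an assumption, adjust to your install):
#   find_namespace_region_ids /var/log/hbase/hbase-master.log
```

If more than one ID is printed, the log needs a closer read; only one of them should belong to the namespace region.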

New Contributor

@rki_ I am getting this error after executing the command "hbase hbck -j .jar assigns f0b4865fe8ea07321ed8eb237a592c10":

 

2023-01-23 16:04:38,448 INFO  [hconnection-0x6be3e1e2-shared-pool-1] client.RpcRetryingCallerImpl: Server.getRegionByEncodedName(HRegionServer.java:3462)

at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3439)

at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1488)

at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:3182)

at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3558)

at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45819)

at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:392)

at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)

at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:359)

at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:339)

, details=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop-datanode2,16020,1674362337687, seqNum=-1, see https://s.apache.org/timeout

2023-01-23 16:05:34,990 WARN  [master/ctrlsu-hbaseMS:16000:becomeActiveMaster] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1674468867063, server=hadoop-datanode2,16020,1674362337687}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.

Expert Contributor

@RammiSE Try the command below:

 

./hbase hbck -j /tmp/hbase-hbck2-1.2.0.jar assigns -o f0b4865fe8ea07321ed8eb237a592c10
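
A note on ordering, offered as a sketch rather than a confirmed fix: the WARN line in the previous post says hbase:meta,,1.1588230740 is NOT online, and the HBCK2 documentation describes assigning 1588230740 for that holding-pattern message before dealing with the namespace region. The wrapper below only prints the commands unless DRY_RUN=0 is set. The function name and the DRY_RUN variable are made up for illustration; the jar path and the -o flag are taken from the command above.

```shell
#!/usr/bin/env bash
# Sketch: assign hbase:meta first, then the namespace region.
HBCK2_JAR="${HBCK2_JAR:-/tmp/hbase-hbck2-1.2.0.jar}"  # jar path from the reply above
DRY_RUN="${DRY_RUN:-1}"                                # 1 = print only, 0 = execute

assign_region() {
  # $1 = encoded region name.
  # -o (override) takes the region from any procedure that currently
  # owns it, so use it with care on a production cluster.
  if [ "$DRY_RUN" = "1" ]; then
    echo "hbase hbck -j $HBCK2_JAR assigns -o $1"
  else
    hbase hbck -j "$HBCK2_JAR" assigns -o "$1"
  fi
}

assign_region 1588230740                        # hbase:meta
assign_region f0b4865fe8ea07321ed8eb237a592c10  # hbase:namespace region from this thread
```

Waiting for the Master log to show meta online before running the second assign is probably wise, since the namespace assign has to read the region from META.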

New Contributor

@rki_ I am executing this command

"hbase hbck -j /tmp/hbase-hbck2-1.2.0.jar assigns f0b4865fe8ea07321ed8eb237a592c"

and getting the error below. Please guide me on the next steps.

 

Exception in thread "main" java.io.IOException: org.apache.hbase.thirdparty.com.google.protobuf.ServiceException: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.UnknownRegionException): org.apache.hadoop.hbase.UnknownRegionException: Error trying to load region f0b4865fe8ea07321ed8eb237a592c10 from META

at org.apache.hadoop.hbase.master.assignment.AssignmentManager.loadRegionFromMeta(AssignmentManager.java:1646)

at org.apache.hadoop.hbase.master.MasterRpcServices.getRegionInfo(MasterRpcServices.java:2581)

at org.apache.hadoop.hbase.master.MasterRpcServices.assigns(MasterRpcServices.java:2615)

at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$HbckService$2.callBlockingMethod(MasterProtos.java)

at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:392)

at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)

at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:359)

at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:339)

Caused by: org.apache.hadoop.hbase.client.RetriesExhaustedException: Failed after attempts=46, exceptions:

2023-01-23T10:34:38.453Z, java.net.SocketTimeoutException: callTimeout=60000, callDuration=68451: org.apache.hadoop.hbase.NotServingRegionException: hbase:meta,,1 is not online on hadoop-datanode2,16020,1674468863385

at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegionByEncodedName(HRegionServer.java:3462)

at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3439)

at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1488)

at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:3182)

at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3558)

at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45819)

at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:392)

at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)

at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:359)

at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:339)

row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop-datanode2,16020,1674362337687, seqNum=-1

 

New Contributor

@rki_ I am getting this error after executing the command "hbase hbck -j jar assigns f0b4865fe8ea07321ed8eb237a592c":

 

2023-01-23 16:04:38,448 INFO  [hconnection-0x6be3e1e2-shared-pool-1] client.RpcRetryingCallerImpl: Server.getRegionByEncodedName(HRegionServer.java:3462)

at org.apache.hadoop.hbase.regionserver.HRegionServer.getRegion(HRegionServer.java:3439)

at org.apache.hadoop.hbase.regionserver.RSRpcServices.getRegion(RSRpcServices.java:1488)

at org.apache.hadoop.hbase.regionserver.RSRpcServices.newRegionScanner(RSRpcServices.java:3182)

at org.apache.hadoop.hbase.regionserver.RSRpcServices.scan(RSRpcServices.java:3558)

at org.apache.hadoop.hbase.shaded.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:45819)

at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:392)

at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:133)

at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:359)

at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:339)

, details=row '' on table 'hbase:meta' at region=hbase:meta,,1.1588230740, hostname=hadoop-datanode2,16020,1674362337687, seqNum=-1, see https://s.apache.org/timeout

2023-01-23 16:05:34,990 WARN  [master/ctrlsu-hbaseMS:16000:becomeActiveMaster] master.HMaster: hbase:meta,,1.1588230740 is NOT online; state={1588230740 state=OPEN, ts=1674468867063, server=hadoop-datanode2,16020,1674362337687}; ServerCrashProcedures=true. Master startup cannot progress, in holding-pattern until region onlined.
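
To check whether the Master is still stuck on the same region after another attempt, the holding-pattern WARN lines can be pulled out of the Master log. This is a minimal sketch that assumes the message format shown in the line above; the function name and the log path in the usage comment are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: print "<region> <recorded server>" for each region the
# Master reports it is waiting on during startup.
blocking_regions() {
  # $1 = path to the HBase Master log
  grep -F 'in holding-pattern until region onlined' "$1" |
    sed -nE 's/.*master\.HMaster: ([^ ]+) is NOT online; state=[{].*server=([^}]+)[}].*/\1 \2/p' |
    sort -u
}

# Usage (path is an assumption, adjust to your install):
#   blocking_regions /var/log/hbase/hbase-master.log
```

For the WARN line above this prints "hbase:meta,,1.1588230740 hadoop-datanode2,16020,1674362337687". The start code in that server name (1674362337687) differs from the one in the NotServingRegionException earlier in the thread (1674468863385), which suggests the recorded meta location may point at an earlier instance of that region server.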