Created 12-03-2017 03:48 PM
2017-12-03 00:52:44,048 FATAL [hostname:60000.activeMasterManager] master.HMaster: Failed to become active master
java.io.IOException: Timedout 300000ms waiting for namespace table to be assigned
at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:104)
at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:1061)
at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:840)
at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:21
ERROR [hostname,60000,1512283354427_ChoreService_1] master.BackupLogCleaner: Failed to get hbase:backup table,therefore will keep all files
Created 12-03-2017 04:07 PM
Can anyone help? We are getting this error while starting HBase.
Created 12-03-2017 04:24 PM
1) Are your region servers started?
2) Can you please run 'hbase hbck' and see if there are any INCONSISTENCIES? If there are, try running the command below, and restart HBase if it fixes them:
hbase hbck -fixAssignments -fixMeta -fixHdfsHoles
3) Can you please attach the complete HBase master logs?
Also, attach the output from #2 if any INCONSISTENCIES are detected.
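The check-then-fix sequence in #2 could be scripted roughly as below. This is only a sketch: the function names and the /tmp/hbck.out path are made up for illustration, and it assumes that hbck prints a "Status: INCONSISTENT" line when it finds problems, so confirm that against the output of your HBase version before relying on it.

```shell
# Succeeds if the hbck report on stdin carries an inconsistent status line.
# Assumption: hbck ends its report with "Status: OK" or "Status: INCONSISTENT".
has_inconsistencies() {
  grep -q 'Status: INCONSISTENT'
}

# Run hbck, keep the report for attaching to the thread, and only run the
# fix command from this post when inconsistencies were reported.
check_and_fix() {
  report=/tmp/hbck.out   # hypothetical location for the saved report
  hbase hbck > "$report" 2>&1
  if has_inconsistencies < "$report"; then
    hbase hbck -fixAssignments -fixMeta -fixHdfsHoles
  else
    echo "no inconsistencies reported, nothing to fix"
  fi
}

# Example: check_and_fix
```

Note that hbck needs a running master, so this only applies once the master is up.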
Thanks,
Aditya
Created 12-03-2017 04:47 PM
Responses to your queries:
1. The region servers are up. The master is down.
2. We cannot run 'hbase hbck' since the master is down.
3. Attaching hbase logs.
Created 12-03-2017 04:56 PM
Can you please attach the logs? I don't see them.
Created 12-04-2017 06:00 AM
Below is the root cause for the issue from the log file given:
2017-12-02 21:29:59,054 ERROR [MASTER_SERVER_OPERATIONS-am1plccmrhdn01:60000-2] executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for am1plccmrhdd17.r1-core.r1.hostname.net,16020,1511029796297, will retry
at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:357)
at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:220)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://prod-gfat/apps/hbase/data/WALs/am1plccmrhdd17.r1-core.r1.hostname.net,16020,1511029796297-splitting] Task = installed = 1 done = 0 error = 0
at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:290)
at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:393)
at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:366)
at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:288)
at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:213)
... 4 more
2017-12-02 21:29:59,055 DEBUG [MASTER_SERVER_OPERATIONS-am1plccmrhdn01:60000-1] master.DeadServer: Finished processing am1plccmrhdd06.r1-core.r1.hostname.net,16020,1511029561545
2017-12-02 21:29:59,055 ERROR [MASTER_SERVER_OPERATIONS-am1plccmrhdn01:60000-4] executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: Server is stopped
at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:194)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
2017-12-02 21:29:59,055 ERROR [MASTER_SERVER_OPERATIONS-am1plccmrhdn01:60000-1] executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for am1plccmrhdd06.r1-core.r1.hostname.net,16020,1511029561545, will retry
at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:357)
at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:220)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:744)
This means that WAL splitting for some dead region servers is stuck, and the resulting WAL inconsistency is preventing the HBase Master from coming up.
To resolve the issue, look for "-splitting" directories under the WALs location in HDFS: /apps/hbase/data/WALs/
You should be able to see: hdfs://prod-gfat/apps/hbase/data/WALs/am1plccmrhdd17.r1-core.r1.hostname.net,16020,1511029796297-splitting
You can either move this directory out of this location or remove it.
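As a rough sketch of the "move" option, the helpers below list the "-splitting" directories and move them to a backup location rather than deleting them. The function names and the /tmp/hbase-wal-backup path are hypothetical, and the sketch assumes it is run as a user with write access to the HBase root directory. Keep in mind that any edits held only in those WALs will not be replayed once the directories are moved away, so keeping a backup is the safer of the two options.

```shell
# Print the paths of WAL directories still marked "-splitting".
# Reads `hdfs dfs -ls` output on stdin; the path is the last field of each line.
list_splitting_dirs() {
  awk '$NF ~ /-splitting$/ { print $NF }'
}

# Move every stuck "-splitting" directory under $1 into backup directory $2.
backup_splitting_dirs() {
  wal_root="$1"     # e.g. /apps/hbase/data/WALs
  backup_dir="$2"   # hypothetical backup path outside the HBase root
  hdfs dfs -mkdir -p "$backup_dir"
  hdfs dfs -ls "$wal_root" | list_splitting_dirs | while read -r dir; do
    echo "moving $dir -> $backup_dir"
    hdfs dfs -mv "$dir" "$backup_dir/"
  done
}

# Example:
# backup_splitting_dirs /apps/hbase/data/WALs /tmp/hbase-wal-backup
```

After the move, restart the HBase Master and check whether the namespace table gets assigned.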
Hope this helps.
Thanks
Venkat