HBase Master: Failed to become active master

Contributor

HBase Master cannot start up!!

Here is the error (trace):

<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

2015-05-08 09:31:02,675 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.library.path=/opt/cloudera/parcels/CDH-5.4.0-1.cdh5.4.0.p0.27/lib/hadoop/lib/native
2015-05-08 09:31:02,675 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.io.tmpdir=/tmp
2015-05-08 09:31:02,675 INFO org.apache.zookeeper.ZooKeeper: Client environment:java.compiler=<NA>
2015-05-08 09:31:02,675 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.name=Linux
2015-05-08 09:31:02,675 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.arch=amd64
2015-05-08 09:31:02,675 INFO org.apache.zookeeper.ZooKeeper: Client environment:os.version=2.6.32-431.el6.x86_64
2015-05-08 09:31:02,676 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.name=hbase
2015-05-08 09:31:02,676 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.home=/var/lib/hbase
2015-05-08 09:31:02,676 INFO org.apache.zookeeper.ZooKeeper: Client environment:user.dir=/var/run/cloudera-scm-agent/process/1831-hbase-MASTER
2015-05-08 09:31:02,676 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=node1:2181,master:2181,node2:2181 sessionTimeout=60000 watcher=master:600000x0, quorum=node1:2181,master:2181,node2:2181, baseZNode=/hbase
2015-05-08 09:31:02,687 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server node1/10.15.230.41:2181. Will not attempt to authenticate using SASL (unknown error)
2015-05-08 09:31:02,690 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to node1/10.15.230.41:2181, initiating session
2015-05-08 09:31:02,695 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server node1/10.15.230.41:2181, sessionid = 0x24d3137ed890570, negotiated timeout = 60000
2015-05-08 09:31:02,729 INFO org.apache.hadoop.hbase.ipc.RpcServer: RpcServer.responder: starting
2015-05-08 09:31:02,729 INFO org.apache.hadoop.hbase.ipc.RpcServer: RpcServer.listener,port=60000: starting
2015-05-08 09:31:02,776 INFO org.mortbay.log: Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
2015-05-08 09:31:02,780 INFO org.apache.hadoop.hbase.http.HttpRequestLog: Http request log for http.requests.master is not defined
2015-05-08 09:31:02,788 INFO org.apache.hadoop.hbase.http.HttpServer: Added global filter 'safety' (class=org.apache.hadoop.hbase.http.HttpServer$QuotingInputFilter)
2015-05-08 09:31:02,790 INFO org.apache.hadoop.hbase.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context master
2015-05-08 09:31:02,790 INFO org.apache.hadoop.hbase.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context static
2015-05-08 09:31:02,791 INFO org.apache.hadoop.hbase.http.HttpServer: Added filter static_user_filter (class=org.apache.hadoop.hbase.http.lib.StaticUserWebFilter$StaticUserFilter) to context logs
2015-05-08 09:31:02,803 INFO org.apache.hadoop.hbase.http.HttpServer: Jetty bound to port 60010
2015-05-08 09:31:02,803 INFO org.mortbay.log: jetty-6.1.26.cloudera.4
2015-05-08 09:31:03,061 INFO org.mortbay.log: Started SelectChannelConnector@0.0.0.0:60010
2015-05-08 09:31:03,063 INFO org.apache.hadoop.hbase.master.HMaster: hbase.rootdir=hdfs://master:8020/hbase, hbase.cluster.distributed=true
2015-05-08 09:31:03,073 INFO org.apache.hadoop.hbase.master.HMaster: Adding backup master ZNode /hbase/backup-masters/master,60000,1431091861873
2015-05-08 09:31:03,142 INFO org.apache.hadoop.hbase.master.ActiveMasterManager: Deleting ZNode for /hbase/backup-masters/master,60000,1431091861873 from backup master directory
2015-05-08 09:31:03,149 INFO org.apache.hadoop.hbase.master.ActiveMasterManager: Registered Active Master=master,60000,1431091861873
2015-05-08 09:31:03,152 INFO org.apache.hadoop.conf.Configuration.deprecation: fs.default.name is deprecated. Instead, use fs.defaultFS
2015-05-08 09:31:03,161 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x348d62d4 connecting to ZooKeeper ensemble=node1:2181,master:2181,node2:2181
2015-05-08 09:31:03,162 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=node1:2181,master:2181,node2:2181 sessionTimeout=60000 watcher=hconnection-0x348d62d40x0, quorum=node1.net:2181,master:2181,node2:2181, baseZNode=/hbase
2015-05-08 09:31:03,162 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server node2/10.15.230.42:2181. Will not attempt to authenticate using SASL (unknown error)
2015-05-08 09:31:03,163 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to node2/10.15.230.42:2181, initiating session
2015-05-08 09:31:03,164 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server node2/10.15.230.42:2181, sessionid = 0x34d3137ea45057a, negotiated timeout = 60000
2015-05-08 09:31:03,184 INFO org.apache.hadoop.hbase.regionserver.HRegionServer: ClusterId : d72a7eb0-8dba-485b-a8bc-2fbf5a182ed7
2015-05-08 09:31:03,365 INFO org.apache.hadoop.hbase.fs.HFileSystem: Added intercepting call to namenode#getBlockLocations so can do block reordering using class class org.apache.hadoop.hbase.fs.HFileSystem$ReorderWALBlocks
2015-05-08 09:31:03,370 INFO org.apache.hadoop.hbase.coordination.SplitLogManagerCoordination: Found 0 orphan tasks and 0 rescan nodes
2015-05-08 09:31:03,383 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=hconnection-0x7ae2a774 connecting to ZooKeeper ensemble=node1:2181,master:2181,node2:2181
2015-05-08 09:31:03,383 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=node1:2181,master:2181,node2:2181 sessionTimeout=60000 watcher=hconnection-0x7ae2a7740x0, quorum=node1:2181,master:2181,node2:2181, baseZNode=/hbase
2015-05-08 09:31:03,385 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server node1/10.15.230.41:2181. Will not attempt to authenticate using SASL (unknown error)
2015-05-08 09:31:03,385 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to node1/10.15.230.41:2181, initiating session
2015-05-08 09:31:03,386 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server node1/10.15.230.41:2181, sessionid = 0x24d3137ed890571, negotiated timeout = 60000
2015-05-08 09:31:03,398 INFO org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: loading config
2015-05-08 09:31:03,439 INFO org.apache.hadoop.hbase.master.HMaster: Server active/primary master=master,60000,1431091861873, sessionid=0x24d3137ed890570, setting cluster-up flag (Was=true)
2015-05-08 09:31:03,452 INFO org.apache.hadoop.hbase.procedure.ZKProcedureUtil: Clearing all procedure znodes: /hbase/online-snapshot/acquired /hbase/online-snapshot/reached /hbase/online-snapshot/abort
2015-05-08 09:31:03,458 INFO org.apache.hadoop.hbase.procedure.ZKProcedureUtil: Clearing all procedure znodes: /hbase/flush-table-proc/acquired /hbase/flush-table-proc/reached /hbase/flush-table-proc/abort
2015-05-08 09:31:03,485 INFO org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper: Process identifier=replicationLogCleaner connecting to ZooKeeper ensemble=node1:2181,master:2181,node2:2181
2015-05-08 09:31:03,485 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=node1:2181,master:2181,node2:2181 sessionTimeout=60000 watcher=replicationLogCleaner0x0, quorum=node1:2181,master:2181,node2:2181, baseZNode=/hbase
2015-05-08 09:31:03,486 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server node1/10.15.230.41:2181. Will not attempt to authenticate using SASL (unknown error)
2015-05-08 09:31:03,486 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to node1/10.15.230.41:2181, initiating session
2015-05-08 09:31:03,487 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server node1/10.15.230.41:2181, sessionid = 0x24d3137ed890572, negotiated timeout = 60000
2015-05-08 09:31:03,495 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 0 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2015-05-08 09:31:04,998 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 1503 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2015-05-08 09:31:06,300 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=node2,60020,1431091828858
2015-05-08 09:31:06,300 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=node3,60020,1431091829281
2015-05-08 09:31:06,300 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=node5,60020,1431091828696
2015-05-08 09:31:06,301 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=node4,60020,1431091828684
2015-05-08 09:31:06,301 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=node1,60020,1431091828790
2015-05-08 09:31:06,301 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 4, slept for 2806 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2015-05-08 09:31:06,352 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 5, slept for 2857 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2015-05-08 09:31:07,855 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 5, slept for 4360 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2015-05-08 09:31:08,006 INFO org.apache.hadoop.hbase.master.ServerManager: Finished waiting for region servers count to settle; checked in 5, slept for 4511 ms, expecting minimum of 1, maximum of 2147483647, master is running
2015-05-08 09:31:08,012 INFO org.apache.hadoop.hbase.master.MasterFileSystem: Log folder hdfs://master:8020/hbase/WALs/node1,60020,1431091828790 belongs to an existing region server
2015-05-08 09:31:08,013 INFO org.apache.hadoop.hbase.master.MasterFileSystem: Log folder hdfs://master:8020/hbase/WALs/node2,60020,1431091828858 belongs to an existing region server
2015-05-08 09:31:08,015 INFO org.apache.hadoop.hbase.master.MasterFileSystem: Log folder hdfs://master:8020/hbase/WALs/node3,60020,1431091829281 belongs to an existing region server
2015-05-08 09:31:08,016 INFO org.apache.hadoop.hbase.master.MasterFileSystem: Log folder hdfs://master:8020/hbase/WALs/node4,60020,1431091828684 belongs to an existing region server
2015-05-08 09:31:08,017 INFO org.apache.hadoop.hbase.master.MasterFileSystem: Log folder hdfs://master:8020/hbase/WALs/node5,60020,1431091828696 belongs to an existing region server
2015-05-08 09:31:08,081 INFO org.apache.hadoop.hbase.master.RegionStates: Transition {1588230740 state=OFFLINE, ts=1431091868026, server=null} to {1588230740 state=OPEN, ts=1431091868081, server=node3,60020,1431091829281}
2015-05-08 09:31:08,083 INFO org.apache.hadoop.hbase.master.ServerManager: AssignmentManager hasn't finished failover cleanup; waiting
2015-05-08 09:31:08,084 INFO org.apache.hadoop.hbase.master.HMaster: hbase:meta assigned=0, rit=false, location=node3,60020,1431091829281
2015-05-08 09:31:08,128 INFO org.apache.hadoop.hbase.MetaMigrationConvertingToPB: hbase:meta doesn't have any entries to update.
2015-05-08 09:31:08,128 INFO org.apache.hadoop.hbase.MetaMigrationConvertingToPB: META already up-to date with PB serialization
2015-05-08 09:31:08,145 INFO org.apache.hadoop.hbase.master.AssignmentManager: Clean cluster startup. Assigning user regions
2015-05-08 09:31:08,150 INFO org.apache.hadoop.hbase.master.AssignmentManager: Joined the cluster in 22ms, failover=false
2015-05-08 09:31:08,161 INFO org.apache.hadoop.hbase.master.TableNamespaceManager: Namespace table not found. Creating...
2015-05-08 09:31:08,189 FATAL org.apache.hadoop.hbase.master.HMaster: Failed to become active master
org.apache.hadoop.hbase.TableExistsException: hbase:namespace
        at org.apache.hadoop.hbase.master.handler.CreateTableHandler.checkAndSetEnablingTable(CreateTableHandler.java:152)
        at org.apache.hadoop.hbase.master.handler.CreateTableHandler.prepare(CreateTableHandler.java:125)
        at org.apache.hadoop.hbase.master.TableNamespaceManager.createNamespaceTable(TableNamespaceManager.java:233)
        at org.apache.hadoop.hbase.master.TableNamespaceManager.start(TableNamespaceManager.java:86)
        at org.apache.hadoop.hbase.master.HMaster.initNamespace(HMaster.java:897)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:739)
        at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:169)
        at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1469)
        at java.lang.Thread.run(Thread.java:745)
2015-05-08 09:31:08,195 FATAL org.apache.hadoop.hbase.master.HMaster: Master server abort: loaded coprocessors are: []
2015-05-08 09:31:08,195 FATAL org.apache.hadoop.hbase.master.HMaster: Unhandled exception. Starting shutdown.

<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>

1 ACCEPTED SOLUTION

Rising Star

So ZK is working fine. Perfect.

 

Make sure HBase is down, then clear everything in HDFS and ZooKeeper:

hadoop fs -rm -r /hbase/*

echo "rmr /hbase" | zookeeper-client
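
For reference, a quick sanity check before restarting (a minimal sketch, assuming the default /hbase root directory and znode used above):

# HDFS: the listing should come back empty once the contents are gone
hadoop fs -ls /hbase

# ZooKeeper: should report that the /hbase znode no longer exists
echo "ls /hbase" | zookeeper-client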

 

Then try to start HBase.

 

JMS


29 REPLIES

Rising Star

Hi Sandeep,

 

This command will remove ALL HBase data... Snapshots, tables, WALs, etc. So be very careful with it.
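
If you want a safety net first, one option (just a sketch; the backup path is an example and assumes enough free space in HDFS) is to copy the HBase root directory aside before deleting anything:

# copy the whole HBase root directory to a backup location in HDFS
hadoop fs -cp /hbase /hbase_backup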

 

JMS

New Contributor

So when the above command deletes everything, I will lose all HBase tables/WALs. Are the above steps recommended in a production environment?

Rising Star

It depends on what you want to achieve, right? If your goal is to repair your production cluster, then it might be relevant. If your goal is just to test and see what it does, then whatever command it is, it should not be tested in production.

 

Basically, the command just does what I described: it deletes everything. It's up to you to decide whether that is relevant for your production environment or not. If your goal is just to drop a table, or all tables, then the HBase shell is the better tool (see the example below).
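
For example, removing a single table from the HBase shell (the table name here is only a placeholder) looks like this:

hbase shell
# a table must be disabled before it can be dropped
disable 'my_table'
drop 'my_table'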

 

Are you facing any issue with your production environment?

New Contributor

No... we don't have any issues so far with HBase. I am just trying to understand how it works.

 

Thanks for your assistance 🙂

Rising Star

Hello,

 

If data were present when this same issue occurred, what would be the best solution?

Can you please help me?

 

Thanks

Rising Star

Hi Vinod,

 

Can you please start a different thread and share your Master logs with us?

 

JMS

New Contributor

This saved a lot of time and effort. Thanks a lot!

Contributor

Hi,

 

Can anyone help me with this? I am getting the error below, and the HBase Master goes down again after every restart.

 

Could not obtain block: BP-892504517-172.31.16.10-1537889422648:blk_1073741826_1002 file=/hbase/hbase.version No live nodes contain current block Block locations: Dead nodes: . Throwing a BlockMissingException

        at org.apache.hadoop.hdfs.DFSInputStream.refetchLocations(DFSInputStream.java:1040)
        at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1023)
        at org.apache.hadoop.hdfs.DFSInputStream.chooseDataNode(DFSInputStream.java:1002)
        at org.apache.hadoop.hdfs.DFSInputStream.blockSeekTo(DFSInputStream.java:642)
        at org.apache.hadoop.hdfs.DFSInputStream.readWithStrategy(DFSInputStream.java:895)
        at org.apache.hadoop.hdfs.DFSInputStream.read(DFSInputStream.java:954)
        at java.io.DataInputStream.read(DataInputStream.java:149)
        at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
        at org.apache.hadoop.hbase.util.FSUtils.getVersion(FSUtils.java:606)
        at org.apache.hadoop.hbase.util.FSUtils.checkVersion(FSUtils.java:689)
        at org.apache.hadoop.hbase.master.MasterFileSystem.checkRootDir(MasterFileSystem.java:500)
        at org.apache.hadoop.hbase.master.MasterFileSystem.createInitialFileSystemLayout(MasterFileSystem.java:169)
        at org.apache.hadoop.hbase.master.MasterFileSystem.<init>(MasterFileSystem.java:144)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:704)
        at org.apache.hadoop.hbase.master.HMaster.access$500(HMaster.java:194)
        at org.apache.hadoop.hbase.master.HMaster$1.run(HMaster.java:1834)
        at java.lang.Thread.run(Thread.java:748)

Rising Star

Hi Ayush,

 

Have you tried to run hdfs fsck / ?
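
For reference, a check along those lines (a sketch, using the file path from the stack trace above) might be:

# overall filesystem health
hdfs fsck /

# status of the specific file the Master cannot read
hdfs fsck /hbase/hbase.version -files -blocks -locations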

 

JM

Contributor

Hi jmspaggi,

 

Thanks a lot for your reply.

 

I was able to resolve the issue by reinstalling HBase alone.

 

Regards

Ayush