Member since
10-19-2017
21
Posts
4
Kudos Received
0
Solutions
06-11-2019
07:05 AM
We are receiving transient NameNode High Availability Status alerts reporting an unknown host for the standby and that the active NameNode cannot be determined. Just before the error, the logs show "unable to extract JSON from JMX Response" from the alert script. The issue resolves itself automatically. We are using the default settings, with an alert grace period of up to 5 seconds. Please help identify the exact cause. There is no further information in the NameNode logs; these errors appear only in the ambari-alerts and agent files.
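A minimal sketch of how the JMX response could be checked by hand when the alert fires. It assumes the body comes from the NameNode `/jmx` servlet queried for the `Hadoop:service=NameNode,name=FSNamesystem` bean, and that the bean carries a `tag.HAState` field; the function name `ha_state_from_jmx` is hypothetical and is not taken from the Ambari alert script.

```python
import json


def ha_state_from_jmx(body):
    """Return the HA state ("active"/"standby") from a JMX JSON body.

    Returns None when the body is not valid JSON or holds no usable bean,
    which is the situation the "unable to extract JSON" message hints at.
    """
    try:
        doc = json.loads(body)
    except (TypeError, ValueError):
        # Not JSON at all, e.g. an HTML error page or an empty reply.
        return None
    if not isinstance(doc, dict):
        return None
    beans = doc.get("beans")
    if not isinstance(beans, list):
        return None
    for bean in beans:
        # Assumption: the FSNamesystem bean exposes "tag.HAState".
        if isinstance(bean, dict) and bean.get("tag.HAState"):
            return bean["tag.HAState"]
    return None
```

Saving the raw response body at the moment of the alert and passing it through a check like this would show whether the NameNode returned an empty or non-JSON reply or a valid document without the expected bean.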
Tags:
- Ambari
- JMX
- namenode-ha
Labels:
- Apache Ambari
- Apache Hadoop
04-02-2018
11:13 AM
YARN Queue Manager is not opening. Attachment: yarn-queue-manager.png
Labels:
- Apache YARN
03-01-2018
06:57 AM
Thanks for your comment. However, after the activity, the HA configuration, and the cluster restart, jobs were not starting on the master machine, and an entry in /etc/hosts was automatically changed to a wrong value. After some time, I ran the jobs on the HA node after moving the master services to it. Now the port is also not listening on the master's IP address.
02-28-2018
09:36 PM
I have completed the activity successfully.
1) First, we configured HBase Master (HMaster) HA and fully restarted the HBase services in the cluster.
2) Then we configured ResourceManager HA and verified the configuration by stopping and starting all the services.
3) We put the master node in maintenance mode, brought it down, and verified the HA setup: the HA NameNode, RM, and HBase Master became active.
4) The master server was shut down.
5) The cache battery was replaced.
6) After the server came back up, the system health was still degraded, so we did some troubleshooting: while rebooting the server, press F10 (Intelligent Provisioning), click the Storage tab, then Actions, then Configure, and finally enable the cache manager.
7) Verified that the system health status was OK and that the amber light was gone.
8) Brought up the h2im-mas services and turned off maintenance mode.
9) Also checked failover again by stopping the HA services; now all services are active on the master and standby on the HA node. (Note: failover can also be done through the command line.)
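The command-line check and failover mentioned in the note can be sketched roughly as follows. This assumes the `hdfs` and `yarn` clients are on the PATH and that the service IDs are `nn1`/`nn2` and `rm1`/`rm2`; those IDs and the helper names are hypothetical, so substitute the values from `dfs.ha.namenodes.<nameservice>` and `yarn.resourcemanager.ha.rm-ids`.

```python
import subprocess


def state_commands(nn_ids, rm_ids):
    """Build the argv lists that query each NameNode / ResourceManager state."""
    cmds = [["hdfs", "haadmin", "-getServiceState", nn] for nn in nn_ids]
    cmds += [["yarn", "rmadmin", "-getServiceState", rm] for rm in rm_ids]
    return cmds


def failover_command(from_nn, to_nn):
    """Build the argv list for a NameNode failover from one ID to the other."""
    return ["hdfs", "haadmin", "-failover", from_nn, to_nn]


def run(cmd):
    """Run one command and return its trimmed stdout (e.g. 'active')."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout.strip()
```

For example, looping over `state_commands(["nn1", "nn2"], ["rm1", "rm2"])` and calling `run` on each entry before and after the maintenance window would record which node held the active role at each point.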
02-27-2018
03:35 PM
1 Kudo
We are planning to replace the faulty cache battery on the master server. Currently, HA is configured in our setup, except for ResourceManager HA and HBase HA. We are also planning to configure RM and HBase HA on the HA node during this window. What is the best way to shut down and start up the master server? The master and HA server components are attached. Please advise on the best way to carry out this activity. Attachments: master-server-components.png, ha-node.png
02-27-2018
02:43 PM
1 Kudo
I solved this issue by enabling some HBase tables: the system catalog, sequence, function, and stats tables.
02-27-2018
11:09 AM
I solved this error by enabling the SYSTEM.CATALOG table. But now the server stops after some time due to another error, even though that table is already present on that slave:
org.apache.hadoop.hbase.NotServingRegionException: Region SYSTEM.CATALOG,,1472064039558.0e215adee75cb12a96b422b7a820da20. is not online
02-27-2018
09:53 AM
I have checked, and all the jars are the same version.
02-25-2018
06:34 AM
1 Kudo
Even after restarting, the Phoenix server keeps stopping. After a restart it works for a few seconds and then stops again. Please help.
Logs:
INFO org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x15ca2ca52ba3d42
2018-02-25 11:15:54,847 FATAL org.apache.phoenix.queryserver.server.Main: Unrecoverable service error. Shutting down.
java.lang.RuntimeException: java.sql.SQLException: ERROR 2006 (INT08): Incompatible jars detected between client and server. Ensure that phoenix.jar is put on the classpath of HBase in every region server: SYSTEM.CATALOG is disabled.
at org.apache.phoenix.queryserver.server.PhoenixMetaFactoryImpl.create(PhoenixMetaFactoryImpl.java:73)
at org.apache.phoenix.queryserver.server.Main.run(Main.java:203)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
at org.apache.phoenix.queryserver.server.Main.main(Main.java:226)
Caused by: java.sql.SQLException: ERROR 2006 (INT08): Incompatible jars detected between client and server. Ensure that phoenix.jar is put on the classpath of HBase in every region server: SYSTEM.CATALOG is disabled.
at org.apache.phoenix.exception.SQLExceptionCode$Factory$1.newException(SQLExceptionCode.java:386)
at org.apache.phoenix.exception.SQLExceptionInfo.buildException(SQLExceptionInfo.java:145)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.checkClientServerCompatibility(ConnectionQueryServicesImpl.java:990)
at org.apache.phoenix.query.ConnectionQueryServicesImpl.ensureTableCreated(ConnectionQueryServicesImpl.java:869)
Labels:
- Apache Phoenix
01-21-2018
07:11 AM
The active HBase Master went down and failover happened, with the standby HBase Master coming up. But after some time all the region servers went down one by one, the standby HBase Master also went down, and finally the whole HBase cluster went offline. We have since started all the services again.
Active HBase Master logs:
2018-01-17 16:22:29,895 ERROR [master/post-om2.vodafone.flytxt.com/10.88.8.79:16000] master.ActiveMasterManager: master:16000-0x35ee3bbff600001, quorum=post-om2.vodafone.flytxt.com:2181,post-om1.vodafone.flytxt.com:2181,post-os1.vodafone.flytxt.com:2181, baseZNode=/hbase Error deleting our own master address node
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired for /hbase/master
at org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1155)
at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.getData(RecoverableZooKeeper.java:359)
at org.apache.hadoop.hbase.zookeeper.ZKUtil.getData(ZKUtil.java:745)
at org.apache.hadoop.hbase.zookeeper.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:148)
at org.apache.hadoop.hbase.master.ActiveMasterManager.stop(ActiveMasterManager.java:267)
at org.apache.hadoop.hbase.master.HMaster.stopServiceThreads(HMaster.java:1145)
at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1071)
at java.lang.Thread.run(Thread.java:744)
2018-01-17 16:22:29,895 INFO [master/post-om2.vodafone.flytxt.com/10.88.8.79:16000] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x15ee3bbf9910003
2018-01-17 16:22:29,897 INFO [master/post-om2.vodafone.flytxt.com/10.88.8.79:16000] zookeeper.ZooKeeper: Session: 0x15ee3bbf9910003 closed
2018-01-17 16:22:29,897 INFO [post-om2:16000.activeMasterManager-EventThread] zookeeper.ClientCnxn: EventThread shut down
2018-01-17 16:22:29,899 INFO [master/post-om2.vodafone.flytxt.com/10.88.8.79:16000] flush.MasterFlushTableProcedureManager: stop: server shutting down.
2018-01-17 16:22:29,899 INFO [master/post-om2.vodafone.flytxt.com/10.88.8.79:16000] ipc.RpcServer: Stopping server on 16000
2018-01-17 16:22:29,900 INFO [RpcServer.listener,port=16000] ipc.RpcServer: RpcServer.listener,port=16000: stopping
New active (former standby) HBase Master logs:
2018-01-17 16:23:06,025 INFO [post-om1:16000.activeMasterManager] master.ActiveMasterManager: Registered Active Master=post-om1.vodafone.flytxt.com,16000,1507059524710
2018-01-17 16:24:46,275 INFO [post-om1:16000.activeMasterManager] master.AssignmentManager: Joined the cluster in 119ms, failover=true
Logs on post-om1 when post-os10 went down:
2018-01-17 18:45:35,364 ERROR [PriorityRpcServer.handler=11,queue=1,port=16000] master.MasterRpcServices: Region server post-os10.vodafone.flytxt.com,16020,1507059587301 reported a fatal error:
ABORTING region server post-os10.vodafone.flytxt.com,16020,1507059587301: IOE in log roller
Cause:
java.io.IOException: cannot get log writer
at org.apache.hadoop.hbase.wal.DefaultWALProvider.createWriter(DefaultWALProvider.java:365)
at org.apache.hadoop.hbase.regionserver.wal.FSHLog.createWriterInstance(FSHLog.java:746)
at org.apache.hadoop.hbase.regionserver.wal.FSHLog.rollWriter(FSHLog.java:711)
at org.apache.hadoop.hbase.regionserver.LogRoller.run(LogRoller.java:137)
at java.lang.Thread.run(Thread.java:744)
Caused by: java.io.FileNotFoundException: Parent directory doesn't exist: /apps/hbase/data/WALs/post-os10.vodafone.flytxt.com,16020,1507059587301
2018-01-17 18:54:21,800 ERROR [PriorityRpcServer.handler=11,queue=1,port=16000] master.MasterRpcServices: Region server post-os5.vodafone.flytxt.com,16020,1513481333569 reported a fatal error:
ABORTING region server post-os5.vodafone.flytxt.com,16020,1513481333569: IOE in log roller
Cause:
java.io.IOException: cannot get log writer
Sample region server logs:
2018-01-17 18:52:10,311 INFO [regionserver/post-os5.vodafone.flytxt.com/10.88.8.76:16020-SendThread(post-om1.vodafone.flytxt.com:2181)] zookeeper.ClientCnxn: Opening socket connection to server post-om1.vodafone.flytxt.com/10.88.8.71:2181. Will not attempt to authenticate using SASL (unknown error)
2018-01-17 18:52:10,311 INFO [regionserver/post-os5.vodafone.flytxt.com/10.88.8.76:16020-SendThread(post-om1.vodafone.flytxt.com:2181)] zookeeper.ClientCnxn: Socket connection established to post-om1.vodafone.flytxt.com/10.88.8.71:2181, initiating session
2018-01-17 18:52:10,313 INFO [regionserver/post-os5.vodafone.flytxt.com/10.88.8.76:16020-SendThread(post-om1.vodafone.flytxt.com:2181)] zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x15ee3bbf9933b39 has expired, closing socket connection
2018-01-17 18:52:10,313 WARN [regionserver/post-os5.vodafone.flytxt.com/10.88.8.76:16020-EventThread] client.ConnectionManager$HConnectionImplementation: This client just lost it's session with ZooKeeper, closing it. It will be recreated next time someone needs it
org.apache.zookeeper.KeeperException$SessionExpiredException: KeeperErrorCode = Session expired
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.connectionEvent(ZooKeeperWatcher.java:606)
at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:517)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:522)
at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:498)
2018-01-17 18:52:10,313 INFO [regionserver/post-os5.vodafone.flytxt.com/10.88.8.76:16020-EventThread] client.ConnectionManager$HConnectionImplementation: Closing zookeeper sessionid=0x15ee3bbf9933b39
2018-01-17 18:52:10,313 INFO [regionserver/post-os5.vodafone.flytxt.com/10.88.8.76:16020-EventThread] zookeeper.ClientCnxn: EventThread shut down
2018-01-17 18:52:10,335 INFO [main-SendThread(post-om2.vodafone.flytxt.com:2181)] zookeeper.ClientCnxn: Opening socket connection to server post-om2.vodafone.flytxt.com/10.88.8.79:2181. Will not attempt to authenticate using SASL (unknown error)
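To line up the ZooKeeper session expiries with the region server aborts across the master and region server logs, a small scanner along these lines might help. It is only a sketch: it assumes the log line layout shown above (timestamp at the start of each entry), and the function name `scan` and the event labels are hypothetical.

```python
import re

# Timestamp prefix as it appears in the logs above: "2018-01-17 18:52:10,313 ".
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} ")
# "ABORTING region server <host>,<port>,<startcode>: ..."
ABORT_RE = re.compile(r"ABORTING region server (\S+?),\d+,\d+:")


def scan(lines):
    """Return (timestamp, event, host) tuples in the order they appear.

    Continuation lines without their own timestamp (stack traces, the
    ABORTING line) inherit the timestamp of the previous log entry.
    """
    events = []
    last_ts = None
    for line in lines:
        ts = TS_RE.match(line)
        if ts:
            last_ts = ts.group(1)
        if "SessionExpiredException" in line or "has expired" in line:
            events.append((last_ts, "zk-session-expired", None))
        abort = ABORT_RE.search(line)
        if abort:
            events.append((last_ts, "regionserver-abort", abort.group(1)))
    return events
```

Running it over each log file and sorting the combined events by timestamp would give a single timeline showing whether the session expiries came before or after each region server abort.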
Tags:
- Data Processing
- HBase
Labels:
- Apache HBase
01-06-2018
02:57 PM
Thanks, Jay Sharma. But are there any documents describing what a professional Hadoop cluster in a corporate environment looks like, with the majority of the important services such as NN, DN, ZooKeeper, HBase, Hive, Oozie, and Sqoop, and how to assign RAM, storage, and so on?
01-05-2018
01:01 PM
1 Kudo
Are there any documents on planning a Hadoop cluster like a pro, covering the important ecosystem components and services?
Tags:
- planning
01-05-2018
12:57 PM
On the master machines, Ambari Server and Ambari Agent are needed, along with Ambari Agent on the data nodes. But once the active NameNode fails and the standby becomes active, how will Ambari Server work on the standby? How will Ambari Server and the agents communicate?
Labels:
- Apache Ambari