Thank you @vpoornalingam
I have added the alert statement above. Hive and Beeline are not working from the command line. I am attaching the error below:
It hung after this statement; nothing appeared afterwards.
Metastore on sandbox.hortonworks.com failed (Execution of 'ambari-sudo.sh su ambari-qa -l -s /bin/bash -c 'export PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/bin:/usr/bin:/var/lib/ambari-agent:/bin/:/usr/bin/:/usr/sbin/:/usr/hdp/current/hive-metastore/bin'"'"' ; export HIVE_CONF_DIR='"'"'/usr/hdp/current/hive-metastore/conf/conf.server'"'"' ; hive --hiveconf hive.metastore.uris=thrift://sandbox.hortonworks.com:9083 --hiveconf hive.metastore.client.connect.retry.delay=1 --hiveconf hive.metastore.failure.retries=1 --hiveconf hive.metastore.connect.retries=1 --hiveconf hive.metastore.client.socket.timeout=14 --hiveconf hive.execution.engine=mr -e '"'"'show databases;'"'"''' was killed due timeout after 30 seconds)
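To see whether the same check hangs outside Ambari, one option is to run it by hand under a time limit. The helper below is only a sketch: `run_with_timeout` is a hypothetical name, it assumes GNU `timeout` is installed, and the commented-out `hive` call reuses the metastore URI and query from the alert text above.

```shell
# Run a command with a time limit and report whether it finished or was killed.
# Usage: run_with_timeout SECONDS COMMAND [ARGS...]
# (hypothetical helper; relies on GNU coreutils `timeout`, which exits with 124
# when the limit is reached)
run_with_timeout() {
  local limit="$1"
  shift
  local rc=0
  timeout "$limit" "$@" || rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "timed out after ${limit}s"
  else
    echo "finished with exit code ${rc}"
  fi
  return "$rc"
}

# Example, run on the sandbox host as a user that can use the hive CLI:
# run_with_timeout 30 hive \
#   --hiveconf hive.metastore.uris=thrift://sandbox.hortonworks.com:9083 \
#   -e 'show databases;'
```

If the manual run also times out, the problem is more likely in the metastore or its database than in the Ambari alert itself.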
Go to Ambari -> Hive -> Services -> Restart All to restart the Hive server and Hive Metastore services.
You may also need to restart the Ambari service. Go to your Ambari server and, on the command line, type:
> ambari-server restart
Make sure you restart it under the same account that originally started the Ambari server (usually root).
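The restart can be wrapped in a small script that also restarts the local agent (suggested later in this thread) and prints the status of both. This is a sketch only: `run` and `restart_ambari` are hypothetical helper names, and the dry-run default is an assumption added so the commands can be reviewed before anything is executed.

```shell
# Print a command instead of running it unless DRY_RUN=0 is set.
# (hypothetical helper; dry-run is the default in this sketch)
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Restart the Ambari server and the local agent, then show their status.
restart_ambari() {
  run ambari-server restart
  run ambari-agent restart
  run ambari-server status
  run ambari-agent status
}

# restart_ambari              # only prints the four commands
# DRY_RUN=0 restart_ambari    # actually runs them (as root)
```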
Thank you @Ed Gleeck
1. I tried the command 'ambari-server restart', which resulted in "heartbeat lost" in the Ambari UI. I was logged in as Admin.
2. I restarted the Ambari server from the command line as the root user, which reported success.
3. I had tried restarting all Hive services before doing this, but that did not clear the alert.
Please let me know if you need more information on the error. Thanks again!
In Beeline, can you pass the actual server name (sandbox.hortonworks.com) instead of 127.0.0.1?
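As a rough illustration of that suggestion, the connection string can be built from the hostname rather than the loopback address. `hive2_url` is a hypothetical helper; port 10000 and the `default` database are the usual HiveServer2 defaults, and the user passed with `-n` is an assumption.

```shell
# Build a HiveServer2 JDBC URL from host, port and database.
# (hypothetical helper; port and database fall back to common defaults)
hive2_url() {
  local host="$1"
  local port="${2:-10000}"
  local db="${3:-default}"
  echo "jdbc:hive2://${host}:${port}/${db}"
}

# Example: connect using the hostname instead of 127.0.0.1.
# The user given with -n is an assumption; use whichever account you normally use.
# beeline -u "$(hive2_url sandbox.hortonworks.com)" -n "$USER"
```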
Please restart ambari-agent as well.
Also, please post the output of these commands:
> hostname
> hostname -i
> cat /etc/hosts
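One way to read those outputs together is to check which address the hosts file maps to the machine's name and compare it with what `hostname -i` prints. The function below is a sketch; `hosts_lookup` is a hypothetical name and it only understands the plain `address name [alias...]` layout of /etc/hosts.

```shell
# Print the addresses that a hosts-format file maps to the given name.
# Usage: hosts_lookup NAME [FILE]   (FILE defaults to /etc/hosts)
# (hypothetical helper)
hosts_lookup() {
  local name="$1"
  local file="${2:-/etc/hosts}"
  awk -v n="$name" '
    { sub(/#.*/, "") }                                   # strip comments
    { for (i = 2; i <= NF; i++) if ($i == n) print $1 }  # match name or alias
  ' "$file"
}

# Example: these two commands should normally print the same address.
# hosts_lookup "$(hostname)"
# hostname -i
```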
It is possible that you are not on a network, or that something on your network (such as a VPN or a local firewall) is blocking ports 10000 and 9083.
Restarting the agent cleared the "heartbeat lost" error, thank you. The alert still persists, though. Here are the outputs:
Using sandbox.hortonworks.com:10000 gave the same error. You said these ports must already be in use (I tried netstat -nl | grep 10000, and the same for port 9083; both showed tcp LISTEN), so what can I do to overcome this issue? Thanks in advance! @Manish Gupta
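A plain `grep 10000` also matches longer numbers that contain those digits, so a stricter check can help when reading the netstat output. The sketch below assumes the usual Linux `netstat -nlt` column layout without `-p` (local address in the fourth column, state in the last); `is_listening` is a hypothetical name. A LISTEN entry on 10000 or 9083 generally just means a process is bound to that port, which is what one would expect while HiveServer2 and the metastore are running.

```shell
# Read `netstat -nlt` style output on stdin and succeed only if a socket
# is in LISTEN state on exactly the given port.
# (hypothetical helper; assumes the standard Linux netstat columns, no -p)
is_listening() {
  awk -v p="$1" '
    $NF == "LISTEN" {
      n = split($4, parts, ":")      # port is the last ":"-separated piece
      if (parts[n] == p) found = 1
    }
    END { exit (found ? 0 : 1) }
  '
}

# Example:
# netstat -nlt | is_listening 10000 && echo "10000 is listening"
# netstat -nlt | is_listening 9083  && echo "9083 is listening"
```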
I am having the same issue, but the Hive service is up and running and I am able to connect to Hive via Beeline. Is there any solution for clearing the Hive Metastore process alert?
The particular alert definition you cited references the Hive Metastore Process. This alert is triggered if the Hive Metastore process cannot be determined to be up and listening on the network. Here's how you can determine what the issue might be.
- On the node where Hive Metastore is running, review the following log: hivemetastore.log (should be in /var/log/hive/). This log should show what the actual issue with the metastore process is. If the issue is still occurring, you will see the error repeated if you tail -f the log.
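For a quick first pass over that log, the most recent error lines can be pulled out before tailing it live. This is a sketch: `recent_errors` is a hypothetical name, the default path is the one mentioned above, and the `error|exception` pattern is an assumption about what the interesting lines contain.

```shell
# Print the last N lines of a log that mention an error or exception.
# Usage: recent_errors [FILE] [N]
# (hypothetical helper; the default path follows the location given above)
recent_errors() {
  local file="${1:-/var/log/hive/hivemetastore.log}"
  local count="${2:-20}"
  grep -iE 'error|exception' "$file" | tail -n "$count"
}

# Example:
# recent_errors                                       # last 20 matches
# recent_errors /var/log/hive/hivemetastore.log 5
# tail -f /var/log/hive/hivemetastore.log             # then watch it live
```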
In my case, the issue was on the MySQL instance: the metastore host had been blocked after too many connection errors (the max_connect_errors limit), and I had to flush the host cache to clear it.
The exact error in the log was: Failed to acquire connection to jdbc:mysql://HOSTNAME/Hive. Host IP is blocked because of many connection errors.
I used the following command to unblock the host on the MySQL backend for Hive:
- On the node where the metastore database is hosted, type the following command:
mysqladmin -u root -p flush-hosts
NOTE: you will be prompted to type in the root password for MySQL.
As I mentioned, there could be several causes, so check the log first to determine the reason.
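To confirm that a given log really shows this particular cause before flushing anything, a check along these lines could be used. It is a sketch: `host_blocked` is a hypothetical name, and the search string is the wording MySQL uses in its "host is blocked" error.

```shell
# Succeed if the log contains MySQL's "host is blocked" connection error.
# Usage: host_blocked FILE
# (hypothetical helper)
host_blocked() {
  grep -qi 'blocked because of many connection errors' "$1"
}

# Example, first on the metastore node, then on the database node:
# host_blocked /var/log/hive/hivemetastore.log && echo "host is blocked; flush hosts on the DB node"
# mysqladmin -u root -p flush-hosts
# mysql -u root -p -e "SHOW GLOBAL VARIABLES LIKE 'max_connect_errors';"
```

If the check does not match, the alert likely has a different cause and the log needs a closer look.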