Created 01-09-2018 07:11 AM
I am getting a critical alert for the Spark Thrift Server.
Alert description: This host-level alert is triggered if the Spark Thrift Server cannot be determined to be up.
Connection failed on host (Traceback (most recent call last): File "/var/lib/ambari-agent/cache/common-services/SPARK/1.2.1/package/scripts/alerts/alert_spark_thrift_port.py", line 143, in execute Execute(cmd, user=hiveruser, path=[beeline_cmd], timeout=CHECK_COMMAND_TIMEOUT_DEFAULT) File "/usr/lib/python2.6/site-packages/resource_management/core/...
I also checked the log at /var/log/spark/org.apache.spark.sql.hive.thriftserver.HiveThriftServer2-1-host.out
but I could not see any error there.
Note: Even though there is an alert for the Spark Thrift Server, the server itself is alive and running. I do not understand why I am getting this critical alert.
Created 01-09-2018 07:17 AM
Can you please share the complete stack trace of the error that you are getting?
The following is incomplete:
Connection failed on host (Traceback (most recent call last): File "/var/lib/ambari-agent/cache/common-services/SPARK/1.2.1/package/scripts/alerts/alert_spark_thrift_port.py", line 143, in execute Execute(cmd, user=hiveruser, path=[beeline_cmd], timeout=CHECK_COMMAND_TIMEOUT_DEFAULT) File "/usr/lib/python2.6/site-packages/resource_management/core/...
Created 01-09-2018 07:41 AM
Hi Jay, can you suggest a REST API or some other way to get the complete stack trace? I copied these details by clicking on the alert in Ambari. That is all I can see there. When I hover the mouse over these lines, a black pop-up appears with more detail, but it is difficult to copy the details from there.
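One possible way to get the full alert text without the pop-up is Ambari's REST alerts endpoint. The sketch below is only an illustration: the host name, cluster name, and credentials are placeholders, and it assumes the usual response layout where each entry under `items` carries an `Alert` object with `definition_name`, `host_name`, and `text` fields.

```python
import base64
import json
import urllib.parse
import urllib.request


def build_alerts_url(base_url, cluster, state="CRITICAL"):
    """Build the alerts URL for one cluster, filtered by alert state."""
    # Keep "/" and "*" unescaped so the query reads fields=*&Alert/state=...
    query = urllib.parse.urlencode(
        {"fields": "*", "Alert/state": state}, safe="/*"
    )
    return "{}/api/v1/clusters/{}/alerts?{}".format(
        base_url.rstrip("/"), urllib.parse.quote(cluster), query
    )


def extract_alert_texts(payload):
    """Return (definition_name, host_name, text) for each alert in the payload."""
    results = []
    for item in payload.get("items", []):
        alert = item.get("Alert", {})
        results.append(
            (alert.get("definition_name"), alert.get("host_name"), alert.get("text"))
        )
    return results


def fetch_alerts(url, user, password):
    """GET the URL with HTTP basic auth and decode the JSON body."""
    token = base64.b64encode("{}:{}".format(user, password).encode("utf-8"))
    request = urllib.request.Request(
        url, headers={"Authorization": "Basic " + token.decode("ascii")}
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

For example, `extract_alert_texts(fetch_alerts(build_alerts_url("http://ambari.example.com:8080", "mycluster"), "admin", "secret"))` should list every critical alert with its untruncated text, where the URL, cluster name, and credentials are hypothetical values to replace with your own.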
Created 01-09-2018 08:19 AM
I can see the line below in the stack trace:
jdbc:hive2://hostname:10000/;principal=hive/_HOST@PRINCIPAL.COM """ transparentMode = binary -e === 2&1|awk """{print}"""|grep -i -e """connection refused""" -e ""Invalid URL"""
Is this what is causing the problem?
Created 01-09-2018 09:16 AM
Please find the full stack trace below:
Connection failed on host Host_name:10015 (Traceback (most recent call last):
  File "/var/lib/ambari-agent/cache/common-services/SPARK/1.2.1/package/scripts/alerts/alert_spark_thrift_port.py", line 143, in execute
    Execute(cmd, user=hiveruser, path=[beeline_cmd], timeout=CHECK_COMMAND_TIMEOUT_DEFAULT)
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 155, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 160, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 124, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 262, in action_run
    tries=self.resource.tries, try_sleep=self.resource.try_sleep)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 72, in inner
    result = function(command, **kwargs)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 102, in checked_call
    tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 150, in _call_wrapper
    result = _call(command, **kwargs_copy)
  File "/usr/lib/python2.6/site-packages/resource_management/core/shell.py", line 297, in _call
    raise ExecuteTimeoutException(err_msg)
ExecuteTimeoutException: Execution of 'ambari-sudo.sh su hive -l -s /bin/bash -c 'export PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/dell/srvadmin/bin:/usr/lib/jvm/java-1.8.0-openjdk/bin:/home/hdpadmin/.local/bin:/home/hdpadmin/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/spark-client/bin/beeline'"'"' ; ! beeline -u '"'"'jdbc:hive2://host.com:10015/default;principal=hive/Host'"'"' transportMode=binary -e '"'"''"'"' 2>&1| awk '"'"'{print}'"'"'|grep -i -e '"'"'Connection refused'"'"' -e '"'"'Invalid URL'"'"''' was killed due timeout after 60.0 seconds
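Going by the trace above, the alert runs beeline against the Thrift port, greps the output for "Connection refused" or "Invalid URL", and is killed if the command takes longer than 60 seconds. So the failure here is a timeout, not a refused connection. A rough way to reproduce the same check by hand is sketched below. It assumes beeline is on the PATH of the user running it (the alert runs it as hive), and the function names are made up for illustration; they are not part of Ambari.

```python
import subprocess

# Strings the alert command greps for (case-insensitively) in beeline's output.
FAILURE_MARKERS = ("connection refused", "invalid url")


def classify_output(output):
    """Mimic the grep step: report the first failure marker found, else OK."""
    lowered = output.lower()
    for marker in FAILURE_MARKERS:
        if marker in lowered:
            return "FAILED: " + marker
    return "OK"


def run_check(jdbc_url, timeout_seconds=60.0):
    """Run beeline with the same arguments as in the trace and classify the result."""
    # Argument list mirrors the command shown in the stack trace.
    cmd = ["beeline", "-u", jdbc_url, "transportMode=binary", "-e", ""]
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # same effect as 2>&1 in the trace
            universal_newlines=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # This is the case the alert reported: beeline did not return in time.
        return "TIMEOUT after {} seconds".format(timeout_seconds)
    return classify_output(proc.stdout)
```

Calling `run_check("jdbc:hive2://host.com:10015/default;principal=hive/Host")` with your real host and principal, ideally as the hive user with a valid Kerberos ticket, should show whether beeline hangs when run manually as well. If it does hang, the beeline output and the Thrift Server log at that moment are the next things to look at.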
Created 04-16-2018 08:27 AM
Is there any news on this post? We hit the same problem recently, and our Spark Thrift Server is not working because of it. Ambari shows it as green, but the red alert tells us that we cannot connect to it using beeline. The Spark2 Thrift Server works, however.
Created 01-06-2019 04:05 PM
I would advise you to open a new thread, because the one you are responding to dates back to January 2018.
Just a friendly piece of advice.