Explorer
Posts: 9
Registered: ‎07-10-2016

SOLR_CORE_STATUS_COLLECTION_HEALTH has become bad

Getting continuous alerts from the Solr server nodes.

SOLR_CORE_STATUS_COLLECTION_HEALTH has become bad: The Cloudera Manager Agent is not able to communicate with this Solr Server over the HTTP API.

 

Checked the Cloudera agent logs and found the error message below.

 


[04/Jan/2017 22:58:57 +0000] 124304 Monitor-SolrServerMonitor throttling_logger ERROR (59 skipped) Error fetching Solr core status at 'http://Hostname:8983/solr//admin/cores?wt=json&action=STATUS'
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/src/cmf/monitor/solrserver/__init__.py", line 349, in _collect_core_status_from_url
    openedUrl = self.urlopen(url, username=username, password=password)
  File "/usr/lib64/cmf/agent/src/cmf/monitor/abstract_monitor.py", line 368, in urlopen
    self._agent.safety_valve))
  File "/usr/lib64/cmf/agent/src/cmf/url_util.py", line 166, in urlopen_with_retry_on_authentication_errors
    return function()
  File "/usr/lib64/cmf/agent/src/cmf/monitor/abstract_monitor.py", line 364, in <lambda>
    password=password),
  File "/usr/lib64/cmf/agent/src/cmf/url_util.py", line 66, in urlopen_with_timeout
    return opener.open(url, data, timeout)
  File "/usr/lib64/python2.6/urllib2.py", line 397, in open
    response = meth(req, response)
  File "/usr/lib64/python2.6/urllib2.py", line 510, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib64/python2.6/urllib2.py", line 435, in error
    return self._call_chain(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 518, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
HTTPError: HTTP Error 500: Internal Server Error

 

Please advise.

 

 

Cloudera Employee
Posts: 17
Registered: ‎01-12-2017

Re: SOLR_CORE_STATUS_COLLECTION_HEALTH has become bad

Can you try accessing the endpoint 'http://{Hostname}:8983/solr//admin/cores?wt=json&action=STATUS' manually in a browser to see whether it returns a proper response?

 

Does it happen for multiple Solr Servers? And which version of CM/Solr is it?
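If a browser is not handy on the host, the same check can be scripted. The following is only a minimal sketch, assuming Python 3 is available and that the endpoint accepts unauthenticated requests (on a Kerberos-enabled cluster a plain request would likely be rejected). The helper names `core_status_url` and `fetch_core_status` are made up for illustration and are not part of Cloudera Manager or Solr.

```python
import json
import urllib.error
import urllib.request


def core_status_url(host, port=8983):
    """Build the CoreAdmin STATUS URL in the same form the agent log shows."""
    return "http://%s:%d/solr//admin/cores?wt=json&action=STATUS" % (host, port)


def fetch_core_status(url, timeout=10):
    """Return (http_status, parsed_json_or_None).

    An HTTP 4xx/5xx answer is returned rather than raised, because Solr
    usually puts the useful error details into the JSON body of the 500.
    Connection-level failures (urllib.error.URLError) are not caught here.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            code, body = resp.getcode(), resp.read()
    except urllib.error.HTTPError as err:
        code, body = err.code, err.read()
    try:
        return code, json.loads(body.decode("utf-8"))
    except ValueError:
        # Body was not JSON (for example an HTML error page).
        return code, None
```

Calling `fetch_core_status(core_status_url("your-solr-host"))` on each Solr Server host would show whether every node answers with a 500 or only some of them; the host name here is a placeholder.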

Explorer
Posts: 9
Registered: ‎07-10-2016

Re: SOLR_CORE_STATUS_COLLECTION_HEALTH has become bad


Yes, it's happening on multiple Solr servers every time.

 

I checked the status of the Solr server with the URL below:

 

http://{host_name}:8983/solr//admin/cores?wt=json&action=STATUS 

 

Error Message:

{"responseHeader":{"status":500,"QTime":25},"defaultCoreName":"collection1","error":{"msg":"Error handling 'status' action ","trace":"org.apache.solr.common.SolrException: Error handling 'status' action \n\tat org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:711)\n\tat org.apache.solr.handler.admin.CoreAdminHandler.handleRequestInternal(CoreAdminHandler.java:215)\n\tat org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:189)\n\tat org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)\n\tat org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:770)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:262)\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:211)\n\tat org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)\n\tat org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)\n\tat org.apache.solr.servlet.SolrHadoopAuthenticationFilter$2.doFilter(SolrHadoopAuthenticationFilter.java:353)\n\tat org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:592)\n\tat org.apache.hadoop.security.token.delegation.web.DelegationTokenAuthenticationFilter.doFilter(DelegationTokenAuthenticationFilter.java:291)\n\tat org.apache.hadoop.security.authentication.server.AuthenticationFilter.doFilter(AuthenticationFilter.java:555)\n\tat org.apache.solr.servlet.SolrHadoopAuthenticationFilter.doFilter(SolrHadoopAuthenticationFilter.java:358)\n\tat org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)\n\tat org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)\n\tat org.apache.solr.servlet.HostnameFilter.doFilter(HostnameFilter.java:86)\n\tat 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)\n\tat org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)\n\tat org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)\n\tat org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)\n\tat org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)\n\tat org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)\n\tat org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)\n\tat org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:293)\n\tat org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:861)\n\tat org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:620)\n\tat org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:489)\n\tat java.lang.Thread.run(Thread.java:745)\nCaused by: org.apache.solr.common.SolrException: Error checking if hdfs path exists\n\tat org.apache.solr.core.HdfsDirectoryFactory.size(HdfsDirectoryFactory.java:348)\n\tat org.apache.solr.core.HdfsDirectoryFactory.size(HdfsDirectoryFactory.java:330)\n\tat org.apache.solr.handler.admin.CoreAdminHandler.getIndexSize(CoreAdminHandler.java:1166)\n\tat org.apache.solr.handler.admin.CoreAdminHandler.getCoreStatus(CoreAdminHandler.java:1141)\n\tat org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:699)\n\t... 28 more\nCaused by: java.io.FileNotFoundException: File does not exist

 

We are using CM 5.5.3, CDH 5.4.9, and Solr 4.10.3.

Cloudera Employee
Posts: 172
Registered: ‎01-09-2014

Re: SOLR_CORE_STATUS_COLLECTION_HEALTH has become bad

The root cause is:

Caused by: java.io.FileNotFoundException: File does not exist

 

You need to look in the server logs on that host; they should indicate which file it is referring to.

 

-pd
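For long traces like the one above, it can help to strip the stack frames and keep only the exception chain. This is a rough sketch in Python 3, not a Solr or Cloudera tool; `cause_chain` is a hypothetical helper, and `sample` is an abbreviated copy of the trace posted in this thread.

```python
def cause_chain(trace):
    """Return the top-level exception followed by each 'Caused by:' entry."""
    chain = []
    for line in trace.splitlines():
        line = line.strip()
        # Skip blank lines, stack frames ("at ...") and "... 28 more" markers.
        if not line or line.startswith("at ") or line.startswith("..."):
            continue
        if line.startswith("Caused by: "):
            line = line[len("Caused by: "):]
        chain.append(line)
    return chain


# Abbreviated from the "trace" field of the JSON error response in this thread.
sample = (
    "org.apache.solr.common.SolrException: Error handling 'status' action \n"
    "\tat org.apache.solr.handler.admin.CoreAdminHandler.handleStatusAction(CoreAdminHandler.java:711)\n"
    "Caused by: org.apache.solr.common.SolrException: Error checking if hdfs path exists\n"
    "\tat org.apache.solr.core.HdfsDirectoryFactory.size(HdfsDirectoryFactory.java:348)\n"
    "\t... 28 more\n"
    "Caused by: java.io.FileNotFoundException: File does not exist"
)

for entry in cause_chain(sample):
    print(entry)
```

The last entry of the chain is the innermost cause, here the `FileNotFoundException`. The trace as posted does not name the missing file, which is why the server logs are the next place to look.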

Explorer
Posts: 9
Registered: ‎07-10-2016

Re: SOLR_CORE_STATUS_COLLECTION_HEALTH has become bad

I am getting this error in cloudera-scm-agent.log:

 

[26/Feb/2017 01:11:38 +0000] 134826 Monitor-SolrServerMonitor throttling_logger ERROR Error fetching Solr core status at 'http://hostname:8983/solr//admin/cores?wt=json&action=STATUS'
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.8.2-py2.6.egg/cmf/monitor/solrserver/__init__.py", line 349, in _collect_core_status_from_url
    openedUrl = self.urlopen(url, username=username, password=password)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.8.2-py2.6.egg/cmf/monitor/abstract_monitor.py", line 368, in urlopen
    self._agent.safety_valve))
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.8.2-py2.6.egg/cmf/url_util.py", line 204, in urlopen_with_retry_on_authentication_errors
    return function()
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.8.2-py2.6.egg/cmf/monitor/abstract_monitor.py", line 364, in <lambda>
    password=password),
  File "/usr/lib64/cmf/agent/build/env/lib/python2.6/site-packages/cmf-5.8.2-py2.6.egg/cmf/url_util.py", line 67, in urlopen_with_timeout
    return opener.open(url, data, timeout)
  File "/usr/lib64/python2.6/urllib2.py", line 391, in open
    response = self._open(req, data)
  File "/usr/lib64/python2.6/urllib2.py", line 409, in _open
    '_open', req)
  File "/usr/lib64/python2.6/urllib2.py", line 369, in _call_chain
    result = func(*args)
  File "/usr/lib64/python2.6/urllib2.py", line 1190, in http_open
    return self.do_open(httplib.HTTPConnection, req)
  File "/usr/lib64/python2.6/urllib2.py", line 1165, in do_open
    raise URLError(err)
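Note that this traceback ends in `URLError`, while the earlier one ended in `HTTPError`. In Python's URL library an `HTTPError` means the server answered with an error status, whereas a plain `URLError` means the request never got an HTTP response at all (for example a refused connection or a timeout). The sketch below illustrates the distinction using Python 3's `urllib.error`, which keeps the same class hierarchy as the agent's Python 2.6 `urllib2`; `classify_fetch_error` is a hypothetical helper, not agent code.

```python
import urllib.error


def classify_fetch_error(exc):
    """Tell a server-side HTTP error apart from a connection-level failure."""
    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        return "http-error %d" % exc.code
    if isinstance(exc, urllib.error.URLError):
        return "connection-failure: %s" % (exc.reason,)
    return "other"
```

The traceback as posted is cut off before the message that normally follows `URLError`, so the underlying reason (refused, timed out, name resolution) is not visible here.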

Explorer
Posts: 9
Registered: ‎07-10-2016

Re: SOLR_CORE_STATUS_COLLECTION_HEALTH has become bad

Yes, it's happening for multiple Solr servers, and I am able to open 'http://{Hostname}:8983/solr//admin/cores?wt=json&action=STATUS' in a browser.

 

CM version: 5.8.2

SOLR version : 4.10.3
