Created on 02-07-2017 01:37 PM - edited 09-16-2022 04:02 AM
After upgrading from CDH 5.7.5 to 5.10.0, Hue's Webserver Status gives error message:
"Bad : The Cloudera Manager Agent is not able to communicate with this role's web server."
Hue is up-and-running, I just think that the CM agent is unable to see it.
I found this in /var/log/cloudera-scm-agent/cloudera-scm-agent.log on the Hue server:
[07/Feb/2017 15:42:09 +0000] 3747 Monitor-GenericMonitor throttling_logger ERROR (59 skipped) Error calling is alive at 'https://0.0.0.0:8888/desktop/debug/is_alive'
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/monitor/generic/hue_adapters.py", line 39, in collect_and_parse
self._call_is_alive(self._metrics_uri)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/monitor/generic/hue_adapters.py", line 76, in _call_is_alive
head_request_with_timeout(is_alive_url, timeout=timeout)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/url_util.py", line 94, in head_request_with_timeout
max_cert_depth)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/url_util.py", line 67, in urlopen_with_timeout
return opener.open(url, data, timeout)
File "/usr/lib64/python2.7/urllib2.py", line 437, in open
response = meth(req, response)
File "/usr/lib64/python2.7/urllib2.py", line 550, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib64/python2.7/urllib2.py", line 475, in error
return self._call_chain(*args)
File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/https.py", line 205, in http_error_default
raise e
HTTPError: HTTP Error 400: BAD REQUEST
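If it helps, the same check should be reproducible by hand with curl (URL taken from the trace above; -k because the role is served over HTTPS):
# HEAD request against the same is_alive URL the agent uses
curl -k -I 'https://0.0.0.0:8888/desktop/debug/is_alive'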
Any help would be most appreciated.
Thanks,
-jon
Created 02-08-2017 10:26 AM
I tested by forcing the agent code to use "0.0.0.0" as the http_host for the Hue health check.
This generated the exception you saw when the check was done:
[08/Feb/2017 09:27:22 +0000] 7376 Monitor-GenericMonitor throttling_logger ERROR Error calling is alive at 'http://0.0.0.0:8888/desktop/debug/is_alive'
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/monitor/generic/hue_adapters.py", line 39, in collect_and_parse
self._call_is_alive(self._metrics_uri)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/monitor/generic/hue_adapters.py", line 76, in _call_is_alive
head_request_with_timeout(is_alive_url, timeout=timeout)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/url_util.py", line 94, in head_request_with_timeout
max_cert_depth)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/url_util.py", line 67, in urlopen_with_timeout
return opener.open(url, data, timeout)
File "/usr/lib64/python2.7/urllib2.py", line 437, in open
response = meth(req, response)
File "/usr/lib64/python2.7/urllib2.py", line 550, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib64/python2.7/urllib2.py", line 475, in error
return self._call_chain(*args)
File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/https.py", line 205, in http_error_default
raise e
HTTPError: HTTP Error 400: BAD REQUEST
I then added the following to my Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini, restarted Hue from Cloudera Manager, and the problem was resolved... the health check ran fine:
[desktop]
allowed_hosts=*
My assessment, then, is that either Hue was not restarted from Cloudera Manager or the restart failed. Note that "allowed_hosts=*" will allow any client to connect.
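For context, Hue's web server is Django-based, and as far as I can tell the 400 comes from Django rejecting any request whose Host header is not covered by allowed_hosts; the agent addresses Hue as 0.0.0.0, so that is the value being checked. You can see the difference by hand from the Hue host (port and endpoint taken from the traces; adjust the scheme if your Hue uses TLS):
# addressed as the wildcard -> 400 unless 0.0.0.0 (or *) is allowed
curl -I 'http://0.0.0.0:8888/desktop/debug/is_alive'
# addressed by FQDN -> 200 when the FQDN is covered by allowed_hosts
curl -I "http://$(hostname -f):8888/desktop/debug/is_alive"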
If anyone has other ideas why my test results differ (with the allowed_hosts configuration change) please let us know.
Actually, in your Hue UI, please verify the "allowed_hosts" configuration that Hue is using in memory. That will tell us.
- Log into Hue as a Hue superuser.
- Click on the "Configuration" subtab
- Under "Configuration Sections and Variables" click "desktop"
- Use your browser's search to find "http_host" (record the values and share with us)
- Use your browser's search to find "allowed_hosts" (record the values and share with us)
Screen shots would be great. Here are screen shots of mine:
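If grabbing screen shots is awkward, the same two values can also be pulled from the generated config on the Hue host (path assumed for a CM-managed install; it reflects what was written at the last Hue start):
grep -E 'http_host|allowed_hosts' /var/run/cloudera-scm-agent/process/`ls -lrt /var/run/cloudera-scm-agent/process/ | awk '{print $9}' | grep HUE_SERVER | tail -1`/hue*.ini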
Created 02-08-2017 10:43 AM
I checked Hue's in-memory values of "http_host" and "allowed_hosts", and they are as follows:
Created 02-08-2017 10:45 AM
I looked at how http_host was set in Cloudera Manager for Hue and found that if Bind Hue Server to Wildcard Address is set in the Cloudera Manager Hue configuration, then the monitor will use the wildcard (0.0.0.0). I backed out my hack of the agent code and can now get the same behavior.
Still, my agent has no trouble talking to Hue even with "Bind Hue Server to Wildcard Address" set in Cloudera Manager as long as I have "allowed_hosts=*" set.
I feel like there is a bug here... I'll sort it out and make sure we address it in the long term.
Let me know if any of this helps and if you can get screen shots of your Hue configuration.
Thanks,
Ben
Created 02-08-2017 11:27 AM
Interesting that you should mention "Bind Hue Server to Wildcard Address". I unchecked that in my Hue configuration, restarted, and the problem went away. I waited a while, well past the point at which the error would typically occur, and figured it was not going to surface.
Just as a test, I re-checked "Bind Hue Server to Wildcard Address", restarted Hue, and the problem came back.
Just thought I'd add this info to the thread.
Created 02-08-2017 11:52 AM
When you posted your screen shots, that had to have been after you unchecked Bind Hue Server to Wildcard Address, as Hue would otherwise have shown http_host as "0.0.0.0".
The only oddity in your case seems to be that allowed_hosts=* is not working for you. I believe it actually was working, but the agent monitor thread had already cached the http_host, so the agent continued to use 0.0.0.0.
I would recommend (to you, too, @SRG) trying the following with Bind Hue Server to Wildcard Address set to true:
(1)
Edit Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini in Cloudera Manager
Make sure the desktop section has:
[desktop]
allowed_hosts=*
(2)
Save the configuration change
(3)
Restart Hue from Cloudera Manager
(4)
Restart the agent on the host where Hue runs with:
service cloudera-scm-agent restart
(5)
Let us know if that allows the Hue host check to succeed. If it fails, please see if you can find the stack/error in the /var/log/cloudera-scm-agent/cloudera-scm-agent.log file.
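For example, something like this should pull out the most recent occurrence of the stack (message text taken from the traces above):
grep -A 20 'Error calling is alive' /var/log/cloudera-scm-agent/cloudera-scm-agent.log | tail -21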
If that works for all of us, then we can maybe tweak the allowed_hosts value to be more secure but still let the agent do its health check.
Thank you,
Ben
Created on 02-08-2017 12:37 PM - edited 02-08-2017 12:39 PM
No, actually that is *not* the case. My screenshots are from when "Bind Hue Server to Wildcard Address" was still checked. My http_host in Hue shows the actual server name because of a setting I had from @mbigelow's first message: "In the same safety valve, Hue Server Advanced Configuration Snippet (Safety Valve) for hue_safety_valve_server.ini, added http_host under the desktop section.."
So I still had the above change in my Hue Server Advanced Configuration Snippet (Safety Valve) for hue_safety_valve_server.ini file, which made my screenshot look the way it did. If I was not supposed to have that setting still set, I apologize.
One thing I noticed when I unchecked the Wildcard setting is that there were two files that showed differences: hue.ini and cloudera-monitor.properties (well, three if you count the binary file, creds.localjceks).
I suspect the update to cloudera-monitor.properties is affecting the way the CM agent is communicating with Hue's web server, because when I defined the host in hue_safety_valve_server.ini, CM still exhibited the error, even though, according to Hue, my http_host was bsl-ib-c4.uncc.edu. But when I unchecked the Bind to Wildcard setting, http_host was changed for both Hue and CM.
Created 02-08-2017 12:56 PM
Indeed, cloudera-monitor.properties includes the configuration. I hadn't considered it before you brought it up, but since the agent pulls its configuration from that file, after you have started Hue you can edit it and set "http_host" to whatever you want. A restart of the Cloudera Manager agent on the Hue host will pick up your changes when it initializes the monitor.
Setting http_host will not impact this issue as Cloudera Manager sets its own host to use for Hue monitoring.
I've opened a Jira internally here with the CM team to assess how to fix this long-term.
For now, we need to make sure the "allowed_hosts" workaround is working or, if you are OK with disabling the Bind Hue Server to Wildcard Address setting so that Hue listens on a specific hostname, then we can just go with that as another workaround.
I consistently have good luck with allowed_hosts=* fixing this, and have also adopted the following to retain some security:
[desktop]
allowed_hosts=.cloudera.com,0.0.0.0,192.168.114.79
The above accepts requests that address Hue as any host in the "cloudera.com" domain, as 0.0.0.0, or as 192.168.114.79.
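To sanity-check an allowed_hosts entry before settling on it, you can fake the Host value the agent would send and watch for a 400 vs. a 200 (curl against the is_alive endpoint from the traces; 0.0.0.0 here stands in for whatever the monitor uses):
# pretend the request was addressed to 0.0.0.0, the way the monitor does
curl -I -H 'Host: 0.0.0.0:8888' "http://$(hostname -f):8888/desktop/debug/is_alive"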
Alternatively, you could do the following:
(1)
Start Hue (observe the bad health)
(2)
cd to the following directory:
/var/run/cloudera-scm-agent/process/`ls -lrt /var/run/cloudera-scm-agent/process/ | awk '{print $9}' | grep HUE_SERVER | tail -1`
(3)
Edit cloudera-monitor.properties
change "http_host" value to the value returned by "hostname -f"
Save
(4)
Restart the agent on that host with:
service cloudera-scm-agent restart
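Rolled into one sketch (same directory trick as step (2); this assumes cloudera-monitor.properties keeps http_host as a plain key=value line):
HUE_DIR=/var/run/cloudera-scm-agent/process/`ls -lrt /var/run/cloudera-scm-agent/process/ | awk '{print $9}' | grep HUE_SERVER | tail -1`
# point the monitor at the real hostname instead of the wildcard address
sed -i "s/^http_host=.*/http_host=$(hostname -f)/" "$HUE_DIR/cloudera-monitor.properties"
service cloudera-scm-agent restart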
This kind of thing won't last long, though, since cloudera-monitor.properties will be recreated when you restart Hue again. Probably not the best solution in that case.
Ben
Created 02-08-2017 01:00 PM
CLARIFICATION:
When I said:
"Setting http_host will not impact this issue as Cloudera Manager sets its own host to use for Hue monitoring."
That was a bit out of context.
I meant that if you set "http_host" in the Hue config [desktop] section, it will not impact the "http_host" chosen by Cloudera Manager if Bind Hue Server to Wildcard Address is enabled in the Hue configuration.
-Ben
Created 02-09-2017 10:06 AM
OK. So this is where I am at.
I must keep the Bind Hue Server to Wildcard Address setting checked, because our Hue server is multi-homed; users access Hue from our "public" network, while the CM agent checks Hue on the "private" network.
Even with "allowed_hosts=*" in the [desktop] section of the Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini, I still observe the issue that the CM agent is not able to communicate with the role's web server. (I don't know why I still observe the issue, yet you do not).
Following the "Alternative" steps in your previous post, I 1) start Hue (observe the bad health), 2) cd to the active Hue server directory under /var/run/cloudera-scm-agent/..., 3) edit the cloudera-monitor.properties, replacing the value for http_host (0.0.0.0) with the actual hostname (bsl-ib-c4.uncc.edu), and 4) restart cloudera-scm-agent. The makes issue goes away, well, until the files get rewritten.
The only other way I have been able to make the issue go away is to uncheck the Bind Hue Server to Wildcard Address setting, but that is not an option in our current configuration.
Thank you for all of your help on this matter, @bgooley and @mbigelow.
Created 11-02-2017 10:40 PM
I have the same issue. Is there any solution/workaround to this?