Created on 02-07-2017 01:37 PM - edited 09-16-2022 04:02 AM
After upgrading from CDH 5.7.5 to 5.10.0, Hue's Webserver Status gives error message:
"Bad : The Cloudera Manager Agent is not able to communicate with this role's web server."
Hue is up-and-running, I just think that the CM agent is unable to see it.
I found this in /var/log/cloudera-scm-agent/cloudera-scm-agent.log on the Hue server:
[07/Feb/2017 15:42:09 +0000] 3747 Monitor-GenericMonitor throttling_logger ERROR (59 skipped) Error calling is alive at 'https://0.0.0.0:8888/desktop/debug/is_alive'
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/monitor/generic/hue_adapters.py", line 39, in collect_and_parse
self._call_is_alive(self._metrics_uri)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/monitor/generic/hue_adapters.py", line 76, in _call_is_alive
head_request_with_timeout(is_alive_url, timeout=timeout)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/url_util.py", line 94, in head_request_with_timeout
max_cert_depth)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/url_util.py", line 67, in urlopen_with_timeout
return opener.open(url, data, timeout)
File "/usr/lib64/python2.7/urllib2.py", line 437, in open
response = meth(req, response)
File "/usr/lib64/python2.7/urllib2.py", line 550, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib64/python2.7/urllib2.py", line 475, in error
return self._call_chain(*args)
File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/https.py", line 205, in http_error_default
raise e
HTTPError: HTTP Error 400: BAD REQUEST
Any help would be most appreciated.
Thanks,
-jon
Created 02-07-2017 04:43 PM
Hello @jehalter,
Actually, it is an interesting one. The problem here is that in CDH 5.10, by default, Hue restricts access to hosts based on the value of "socket.getfqdn()".
Based on the URL that is being used to try to contact Hue, it seems you have some hostname/dns issues here:
Error calling is alive at 'https://0.0.0.0:8888/desktop/debug/is_alive'
In any case, apart from maybe taking a look at how the hostname and domain are configured on that host, it appears that Hue is blocking the connection because the client's domain does not match allowed_hosts.
In short, I would recommend trying to run the following:
# python -c 'import socket;print socket.getfqdn()'
Then, take the result and add it to your current list of allowed_hosts.
To do so:
(1)
In Cloudera Manager, navigate to Hue --> Configuration and then search for "Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini"
(2)
In the Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini text field, add the following:
[desktop]
allowed_hosts=example.com
(replace "example.com" with the domain of your host)
If preferred and you just want to get this working, you can open up connections from all hosts by adding this instead:
[desktop]
allowed_hosts=*
(3)
Restart Hue
CAUSE:
In CDH 5.10, security was tightened in Hue by only allowing clients that are in the same domain as Hue to connect to Hue. It appears your hostname is actually 0.0.0.0 on the host, so the real fix would be to ensure that the host is configured to have a valid hostname and FQDN.
Let us know if you have questions about the steps or they don't work.
Ben
Created 02-07-2017 05:12 PM
Created 02-07-2017 08:56 PM
Created 02-08-2017 07:24 AM
Ben -- Thanks for the info on the increased security in 5.10; it's good to know. I tried your suggestions, both "allowed_hosts=<hostname>" and "allowed_hosts=*", in the "Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini", restarted Hue, and neither helped.
Just FYI, my client is currently the Hue server itself; I'm running Firefox on the same machine the Hue server is on. I want to reiterate: I am not having trouble connecting to Hue with my client; the CM agent is having trouble communicating with Hue's web server.
To answer your followup question, our distribution is not SLES; we're running RHEL 7.2.
Thanks,
Jonathan
Created 02-08-2017 08:10 AM
Thank you for the information and I'm sorry to hear that it didn't help.
I did understand that you were not having trouble with the browser; the agent acts as a client of Hue, so I was considering it likely that the new allowed_hosts feature was blocking the agent from connecting to Hue.
Just to be sure, you added the following to the [desktop] section in the safety valve:
allowed_hosts=*
If you don't add the [desktop] section header before it, then the configuration will be ignored. As long as "allowed_hosts=*" is in the [desktop] section and you restarted, I'd assess that to be a clean test.
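As an aside, the reason the header matters can be seen with any ini parser. The sketch below uses Python's stdlib configparser purely as an illustration; Hue's own ini parser differs, but the sectioning principle is the same:

```python
import configparser

# A key under its section header parses as expected.
snippet = """\
[desktop]
allowed_hosts=*
"""
cp = configparser.ConfigParser()
cp.read_string(snippet)
print(cp.get("desktop", "allowed_hosts"))  # *

# The same key with no section header above it fails to parse at all.
try:
    configparser.ConfigParser().read_string("allowed_hosts=*\n")
except configparser.MissingSectionHeaderError:
    print("key outside a section is rejected")
```

So a bare "allowed_hosts=*" pasted into the safety valve without "[desktop]" above it never reaches the desktop configuration.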
I'll let you know what we might try next.
Ben
Created 02-08-2017 08:17 AM
Yes, most definitely added the header. This is what my "Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini" looks like:
[desktop]
allowed_hosts=*
[hbase]
hbase_conf_dir={{HBASE_CONF_DIR}}
[beeswax]
use_get_log_api=true
Thanks,
Jonathan
Created 02-08-2017 09:34 AM
This gets more interesting.
The URL used in the hue health check is generated based on configurations returned by the Cloudera Manager heartbeat response. I tried overriding the "http_host" config in the Hue safety valve to set it to 0.0.0.0 but that did not alter my URL at all.
I checked and indeed the heartbeat response does not include the value for http_host. I think this is a Cloudera Manager bug/issue, but I'll need to take a closer look.
That said, I decided to hack the hue_adapters.py code to just use the host string "0.0.0.0" and reproduced the issue.
The question I have, then, is how Cloudera Manager is reporting "0.0.0.0". That indicates to me that there is some problem with the hostname of the host as it appears in your Cloudera Manager Hosts tab. Cloudera Manager will attempt to use the hostname that is reported by the agent to set Hue's http_host value to the hostname on which Hue is running. If it is not able to do so, it is possible it leaves the Hue default, which is "0.0.0.0".
I'll keep looking, but please check what your Hue host is named in Cloudera Manager's Hosts tab.
Thanks!
Ben
Created 02-08-2017 09:47 AM
Created 02-08-2017 10:16 AM
@bgooley and @mbigelow -- thanks for your help on this!
Here is the fqdn, as given by the python command:
$ python -c 'import socket;print socket.getfqdn()'
bsl-ib-c4.uncc.edu
Which matches the hostname in the hosts tab in CM:
The CM Host Inspector is happy with all hosts, including this one.
Would it be worth moving Hue to another machine in the cluster, to see if that changes anything? By the way, this is a test cluster that I am using to try out 5.10.0 before rolling it into production, so I have the luxury of trying just about anything without affecting any users.
Created 02-08-2017 10:26 AM
I tested by forcing the agent code to use "0.0.0.0" as the http_host for the Hue health check.
This generated the exception you saw when the check was done:
[08/Feb/2017 09:27:22 +0000] 7376 Monitor-GenericMonitor throttling_logger ERROR Error calling is alive at 'http://0.0.0.0:8888/desktop/debug/is_alive'
Traceback (most recent call last):
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/monitor/generic/hue_adapters.py", line 39, in collect_and_parse
self._call_is_alive(self._metrics_uri)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/monitor/generic/hue_adapters.py", line 76, in _call_is_alive
head_request_with_timeout(is_alive_url, timeout=timeout)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/url_util.py", line 94, in head_request_with_timeout
max_cert_depth)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/url_util.py", line 67, in urlopen_with_timeout
return opener.open(url, data, timeout)
File "/usr/lib64/python2.7/urllib2.py", line 437, in open
response = meth(req, response)
File "/usr/lib64/python2.7/urllib2.py", line 550, in http_response
'http', request, response, code, msg, hdrs)
File "/usr/lib64/python2.7/urllib2.py", line 475, in error
return self._call_chain(*args)
File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
result = func(*args)
File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/https.py", line 205, in http_error_default
raise e
HTTPError: HTTP Error 400: BAD REQUEST
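Outside the agent, the same check can be reproduced with a small standalone probe. This is only a sketch (modern Python rather than the agent's Python 2, and the real logic lives in cmf/monitor/generic/hue_adapters.py); the host, port, and scheme are assumptions you would adjust for your deployment:

```python
import urllib.request
import urllib.error

def is_alive_url(host, port=8888, scheme="http"):
    # Build the same endpoint URL the agent probes.
    return "%s://%s:%d/desktop/debug/is_alive" % (scheme, host, port)

def check_is_alive(host, port=8888, scheme="http", timeout=10):
    """Issue a HEAD request like the agent's health check; return the HTTP status."""
    req = urllib.request.Request(is_alive_url(host, port, scheme), method="HEAD")
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status  # 200 when the Host header is accepted
    except urllib.error.HTTPError as e:
        return e.code  # 400 when allowed_hosts rejects the Host header
```

Running check_is_alive("0.0.0.0") against a Hue that enforces allowed_hosts reproduces the 400 seen in the traceback above.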
I then added the following to my Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini, restarted Hue from Cloudera Manager, and the problem was resolved... the health check ran fine:
[desktop]
allowed_hosts=*
My assessment, then, is that either Hue was not restarted from Cloudera Manager or the restart failed. "allowed_hosts=*" will allow any clients to connect.
If anyone has other ideas why my test results differ (with the allowed_hosts configuration change) please let us know.
Actually, in your Hue UI, please verify the "allowed_hosts" configuration that Hue is using in memory. That will tell us.
- Log into Hue as a Hue superuser.
- Click on the "Configuration" subtab
- Under "Configuration Sections and Variables" click "desktop"
- Search with your browser search for "http_host" (record the values and share with us)
- Search with your browser for "allowed_hosts" (record the values and share with us)
Screen shots would be great. Here are screen shots of mine:
Created 02-08-2017 10:43 AM
I checked Hue's in-memory values of "http_host" and "allowed_hosts", and they are as follows:
Created 02-08-2017 10:45 AM
I looked at how http_host was set in Cloudera Manager for Hue and found that if Bind Hue Server to Wildcard Address is set in the Cloudera Manager Hue configuration, then the monitor will use the wildcard (0.0.0.0). I backed out my hack of the agent code and can now get the same behavior.
Still, my agent has no trouble talking to Hue even with "Bind Hue Server to Wildcard Address" set in Cloudera Manager as long as I have "allowed_hosts=*" set.
I feel like there is a bug here... I'll sort it out and make sure we address it in the long term.
Let me know if any of this helps and if you can get screen shots of your Hue configuration.
Thanks,
Ben
Created 02-08-2017 11:27 AM
Interesting that you should mention "Bind Hue Server to Wildcard Address". I unchecked that in my Hue configuration, restarted, and the problem went away. I waited a while, well past the point at which the error would typically occur, and figured that it was not going to surface.
Just as a test, I re-checked "Bind Hue Server to Wildcard Address", restarted Hue, and the problem came back.
Just thought I'd add this info to the thread.
Created 02-08-2017 11:52 AM
When you posted your screen shots, that must have been after you unchecked Bind Hue Server to Wildcard Address, as Hue would otherwise have shown http_host as "0.0.0.0".
The only oddity in your case seems to be that allowed_hosts=* is not working for you. I believe it was working, but the agent monitor thread had already cached the http_host, so the agent continued to use 0.0.0.0.
I would recommend (to you, too, @SRG) trying the following with Bind Hue Server to Wildcard Address set to true:
(1)
Edit Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini in Cloudera Manager
Make sure the desktop section has:
[desktop]
allowed_hosts=*
(2)
Save the configuration change
(3)
Restart Hue from Cloudera Manager
(4)
Restart the agent on the host where Hue runs with:
service cloudera-scm-agent restart
(5)
Let us know if that allows the Hue host check to succeed. If it fails, please see if you can find the stack/error in the /var/log/cloudera-scm-agent/cloudera-scm-agent.log file.
If that works for all of us, then we can maybe tweak the allowed_hosts value to be more secure but still let the agent do its health check.
Thank you,
Ben
Created on 02-08-2017 12:37 PM - edited 02-08-2017 12:39 PM
No, actually that is *not* the case. My screenshots are from when the "Bind Hue Server to Wildcard Address" was still checked. The reason my http_host in Hue shows the actual server name is because of a setting I had from @mbigelow's first message: "In the same safety valve, Hue Server Advanced Configuration Snippet (Safety Valve) for hue_safety_valve_server.ini, added http_host under the desktop section.."
So I still had the above change in my Hue Server Advanced Configuration Snippet (Safety Valve) for hue_safety_valve_server.ini file, which made my screenshot look the way it did. If I was not supposed to have that setting still set, I apologize.
One thing I noticed when I unchecked the Wildcard setting is that there were two files that showed differences: hue.ini and cloudera-monitor.properties (well, three if you count the binary file, creds.localjceks):
I suspect the update to cloudera-monitor.properties is affecting the way the CM agent is communicating with Hue's web server, because when I defined the host in the hue_safety_valve_server.ini, CM still exhibited the error, even though according to Hue, my http_host was bsl-ib-c4.uncc.edu. But when I unchecked the Bind to Wildcard setting, http_host was changed for both Hue and CM.
Created 02-08-2017 12:56 PM
Indeed, the cloudera-monitor.properties file includes the configuration. I hadn't considered it before you brought that up, but since the agent pulls its configuration from that file, after you have started Hue you can edit the file and set "http_host" to what you want. A restart of the Cloudera Manager agent on the Hue host would pick up your changes when it initializes the monitor.
Setting http_host will not impact this issue as Cloudera Manager sets its own host to use for Hue monitoring.
I've opened a Jira internally here with the CM team to assess how to fix this long-term.
For now, we need to make sure the "allowed_hosts" workaround is working or, if you are OK with disabling the Bind Hue Server to Wildcard Address setting so that Hue listens on a specific hostname, then we can just go with that as another workaround.
I consistently have good luck with the allowed_hosts=* fixing this and have also adopted the following to retain some security:
[desktop]
allowed_hosts=.cloudera.com,0.0.0.0,192.168.114.79
The above allows access from any host in the "cloudera.com" domain that is connecting to either 0.0.0.0 or 192.168.114.79.
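For reference, Hue is built on Django, and allowed_hosts follows Django-style host matching. The helper below is a rough illustration of those semantics, not Hue's actual code; the hostnames in the examples are taken from this thread:

```python
def validate_host(host, allowed_hosts):
    """Return True if the request's Host header matches allowed_hosts.

    Django-style semantics: '*' matches anything; a pattern beginning
    with '.' matches the bare domain and any subdomain; anything else
    must match exactly."""
    host = host.lower()
    for pattern in allowed_hosts:
        pattern = pattern.lower()
        if pattern == "*" or pattern == host:
            return True
        if pattern.startswith(".") and (
                host == pattern[1:] or host.endswith(pattern)):
            return True
    return False

print(validate_host("bsl-ib-c4.uncc.edu", [".cloudera.com", "0.0.0.0"]))  # False
print(validate_host("gateway.cloudera.com", [".cloudera.com"]))           # True
```

This is why the leading dot in ".cloudera.com" matters: without it, only the bare name "cloudera.com" would match, and subdomains would be rejected.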
Alternatively, you could do the following:
(1)
Start Hue (observe the bad health)
(2)
cd to the following directory:
/var/run/cloudera-scm-agent/process/`ls -lrt /var/run/cloudera-scm-agent/process/ | awk '{print $9}' |grep HUE_SERVER| tail -1`
(3)
Edit cloudera-monitor.properties
change "http_host" value to the value returned by "hostname -f"
Save
(4)
Restart the agent on that host with:
service cloudera-scm-agent restart
This kind of thing won't last long, though, since cloudera-monitor.properties will be recreated when you restart Hue again. Probably not the best solution in that case.
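If you do need to repeat the temporary edit, steps (2) and (3) can be scripted. This is a hypothetical helper built on the assumptions in the steps above (newest HUE_SERVER process directory, a plain http_host= line in cloudera-monitor.properties), not something tested against a live agent:

```python
import glob
import os
import re
import socket

def set_http_host(text, fqdn):
    """Rewrite the http_host line in cloudera-monitor.properties content."""
    return re.sub(r"(?m)^(http_host\s*=\s*).*$", r"\g<1>" + fqdn, text)

def patch_monitor_properties(proc_root="/var/run/cloudera-scm-agent/process"):
    """Point http_host in the newest HUE_SERVER process dir at this host's FQDN."""
    dirs = sorted(glob.glob(os.path.join(proc_root, "*HUE_SERVER*")),
                  key=os.path.getmtime)
    if not dirs:
        raise RuntimeError("no HUE_SERVER process directory found")
    props = os.path.join(dirs[-1], "cloudera-monitor.properties")
    with open(props) as f:
        patched = set_http_host(f.read(), socket.getfqdn())
    with open(props, "w") as f:
        f.write(patched)
    # Afterward, restart the agent: service cloudera-scm-agent restart
```

As noted, the file is regenerated on every Hue restart, so this is a stopgap at best.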
Ben
Created 02-08-2017 01:00 PM
CLARIFICATION:
When I said:
"Setting http_host will not impact this issue as Cloudera Manager sets its own host to use for Hue monitoring."
That was a bit out of context.
I meant that if you set "http_host" in the Hue config [desktop] section, it will not impact the "http_host" chosen by Cloudera Manager if Bind Hue Server to Wildcard Address is enabled in the Hue configuration.
-Ben
Created 02-09-2017 10:06 AM
OK. So this is where I am at.
I must keep the Bind Hue Server to Wildcard Address setting checked, because our Hue server is multi-homed; users access Hue from our "public" network, while the CM agent checks Hue on the "private" network.
Even with "allowed_hosts=*" in the [desktop] section of the Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini, I still observe the issue that the CM agent is not able to communicate with the role's web server. (I don't know why I still observe the issue, yet you do not).
Following the "Alternative" steps in your previous post, I 1) start Hue (observe the bad health), 2) cd to the active Hue server directory under /var/run/cloudera-scm-agent/..., 3) edit the cloudera-monitor.properties, replacing the value for http_host (0.0.0.0) with the actual hostname (bsl-ib-c4.uncc.edu), and 4) restart cloudera-scm-agent. That makes the issue go away, well, until the files get rewritten.
The only other way I have been able to make the issue go away is to uncheck the Bind Hue Server to Wildcard Address setting, but that is not an option in our current configuration.
Thank you for all of your help on this matter, @bgooley and @mbigelow.
Created 11-02-2017 10:40 PM
I have the same issue. Is there any solution/workaround to this?
Created 12-12-2017 06:43 AM
You can follow the solution on the previous page; it works!
By the way, I have just upgraded to CM and CDH 5.13.1 and this issue is still present. Will it be fixed?