
Hue: The Cloudera Manager Agent is not able to communicate with this role's web server.

Contributor

After upgrading from CDH 5.7.5 to 5.10.0, Hue's Web Server Status health check gives this error message:

 

"Bad : The Cloudera Manager Agent is not able to communicate with this role's web server."

 

Hue is up and running; I just think the CM agent is unable to see it.

 

I found this in /var/log/cloudera-scm-agent/cloudera-scm-agent.log on the Hue server:

 

[07/Feb/2017 15:42:09 +0000] 3747 Monitor-GenericMonitor throttling_logger ERROR (59 skipped) Error calling is alive at 'https://0.0.0.0:8888/desktop/debug/is_alive'
Traceback (most recent call last):
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/monitor/generic/hue_adapters.py", line 39, in collect_and_parse
    self._call_is_alive(self._metrics_uri)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/monitor/generic/hue_adapters.py", line 76, in _call_is_alive
    head_request_with_timeout(is_alive_url, timeout=timeout)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/url_util.py", line 94, in head_request_with_timeout
    max_cert_depth)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/url_util.py", line 67, in urlopen_with_timeout
    return opener.open(url, data, timeout)
  File "/usr/lib64/python2.7/urllib2.py", line 437, in open
    response = meth(req, response)
  File "/usr/lib64/python2.7/urllib2.py", line 550, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib64/python2.7/urllib2.py", line 475, in error
    return self._call_chain(*args)
  File "/usr/lib64/python2.7/urllib2.py", line 409, in _call_chain
    result = func(*args)
  File "/usr/lib64/cmf/agent/build/env/lib/python2.7/site-packages/cmf-5.10.0-py2.7.egg/cmf/https.py", line 205, in http_error_default
    raise e
HTTPError: HTTP Error 400: BAD REQUEST
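The agent's health check is just an HTTPS HEAD request against /desktop/debug/is_alive. To reproduce it outside the agent, here is a rough Python 3 sketch (the helper names and example host are my own, not Cloudera's; certificate verification is disabled only because we care about the HTTP status code, not the TLS identity):

```python
import ssl
import urllib.request


def is_alive_url(host, port=8888):
    """Build the same endpoint URL the CM agent probes."""
    return "https://%s:%d/desktop/debug/is_alive" % (host, port)


def check_is_alive(host, port=8888, timeout=10):
    """HEAD the is_alive endpoint and return the HTTP status code.

    A 400 here reproduces the agent's failure. TLS checks are
    skipped because only the status code matters for this test.
    """
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    req = urllib.request.Request(is_alive_url(host, port), method="HEAD")
    with urllib.request.urlopen(req, timeout=timeout, context=ctx) as resp:
        return resp.status


# Example (hypothetical host name -- substitute your Hue host):
# print(check_is_alive("hue-host.example.com"))
print(is_alive_url("0.0.0.0"))  # the URL from the log above
```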

 

Any help would be most appreciated.

 

Thanks,

-jon


Master Guru

Hello @jehalter,

 

This is an interesting one.  The problem here is that in CDH 5.10, by default, Hue restricts access to hosts based on the value of "socket.getfqdn()".

 

Based on the URL that is being used to try to contact Hue, it seems you have some hostname/dns issues here:

 

Error calling is alive at 'https://0.0.0.0:8888/desktop/debug/is_alive'

 

In any case, apart from maybe taking a look at how your host and domain are configured on the host, it appears that Hue is blocking the connection because the client's domain does not match allowed_hosts.

 

In short, I would recommend trying to run the following:

 

 

# python -c 'import socket;print socket.getfqdn()'

 

Then, take the result and add it to your current list of allowed_hosts.

To do so:

 

(1)

 

In Cloudera Manager, navigate to Hue --> Configuration and then search for "Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini"

 

(2)

 

In the Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini text field, add the following:

 

[desktop]

allowed_hosts=example.com

 

(replace "example.com" with the domain of your host)

 

If preferred and you just want to get this working, you can open up connections from all hosts by adding this instead:

 

[desktop]

allowed_hosts=*

 

(3)

 

Restart Hue

 

CAUSE:

 

In CDH 5.10, security was tightened in Hue by only allowing clients that are in the same domain as Hue to connect to Hue.  It appears your hostname is actually 0.0.0.0 on the host, so the real fix would be to ensure that the host is configured with a valid hostname and FQDN.
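Hue is built on Django, and allowed_hosts feeds a Django-style ALLOWED_HOSTS check: a request whose Host header matches nothing in the list is rejected with HTTP 400, which lines up with the traceback above. A simplified sketch of that matching logic (my own illustration, not Hue's actual code):

```python
def host_allowed(host, allowed):
    """Simplified ALLOWED_HOSTS-style matching.

    "*" matches everything; a pattern beginning with "." matches the
    domain and any subdomain; anything else must match exactly
    (case-insensitively).
    """
    host = host.lower().rstrip(".")
    for pattern in allowed:
        pattern = pattern.lower()
        if pattern == "*":
            return True
        if pattern.startswith("."):
            if host == pattern[1:] or host.endswith(pattern):
                return True
        elif host == pattern:
            return True
    return False


# The agent calls Hue at 0.0.0.0, which matches no real domain:
print(host_allowed("0.0.0.0", ["bsl-ib-c4.example.edu"]))  # False
print(host_allowed("0.0.0.0", ["*"]))                      # True
```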

 

Let us know if you have questions about the steps or they don't work.

 

Ben

Master Guru

@jehalter,

 

Actually, are you on SLES? This may be a bug.

 

Thanks,

 

Ben

Champion
Doesn't it bind to IP address 0.0.0.0 by default? Or maybe it is bound to 0.0.0.0 in the configs. I may be mistaken on that. You can try binding it to the correct IP or hostname.

In the Hue Server Advanced Configuration Snippet (Safety Valve) for hue_safety_valve_server.ini, add http_host under the [desktop] section:

[desktop]
http_host=hue-host.example.com
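For background, 0.0.0.0 is a wildcard bind address rather than a real hostname: a server bound to it listens on every interface, which is why it works fine as a listen address but fails as the host part of a URL used for identity checks. A quick illustration:

```python
import socket

# Bind a listening socket to the wildcard address on an ephemeral port.
srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
srv.bind(("0.0.0.0", 0))
srv.listen(1)
host, port = srv.getsockname()
print(host, port)  # "0.0.0.0" plus an ephemeral port

# A client can still reach it via loopback, even though the server
# "bound to 0.0.0.0" -- the wildcard means "all local interfaces".
cli = socket.create_connection(("127.0.0.1", port), timeout=5)
cli.close()
srv.close()
```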

Contributor

Ben-- Thanks for the info on the increased security in 5.10; it's good to know. I tried your suggestions, both "allowed_hosts=<hostname>" and "allowed_hosts=*", in the "Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini", restarted Hue, and neither helped.

 

Just FYI, my client is currently the Hue server, so I'm running Firefox on the same machine that the Hue server is on. I want to reiterate: I am not having trouble connecting to Hue with my client; the CM agent is having trouble communicating with Hue's web server.

 

To answer your followup question, our distribution is not SLES; we're running RHEL 7.2.

 

Thanks,

Jonathan

Master Guru

@jehalter,

 

Thank you for the information and I'm sorry to hear that it didn't help.

I did understand that you were not having trouble with the browser; the agent acts as a client of Hue, so I was considering it likely that the new allowed_hosts feature was blocking the agent from connecting to Hue.

 

Just to be sure, you added the following to the [desktop] section in the safety valve:

 

allowed_hosts=*

 

If you don't add the [desktop] section header before it, then the configuration will be ignored.  As long as "allowed_hosts=*" is in the [desktop] section and you restarted, I'd assess that to be a clean test.
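The safety valve is a plain INI fragment, so one quick way to sanity-check a snippet before pasting it in is to run it through Python's configparser; a key with no section header above it doesn't even parse (this is just an illustrative check, not something CM runs):

```python
import configparser

snippet = """\
[desktop]
allowed_hosts=*
"""

cp = configparser.ConfigParser()
cp.read_string(snippet)
print(cp.get("desktop", "allowed_hosts"))  # *

# Without the [desktop] header, the same key is invalid INI:
try:
    configparser.ConfigParser().read_string("allowed_hosts=*\n")
except configparser.MissingSectionHeaderError:
    print("key outside a section: invalid")
```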

 

I'll let you know what we might try next.

 

Ben

Contributor

Yes, most definitely added the header. This is what my "Hue Service Advanced Configuration Snippet (Safety Valve) for hue_safety_valve.ini" looks like:

 

[desktop]
allowed_hosts=*
[hbase]
hbase_conf_dir={{HBASE_CONF_DIR}}
[beeswax]
use_get_log_api=true

 

Thanks,

Jonathan

Master Guru

@jehalter and @mbigelow,

 

This gets more interesting.

The URL used in the hue health check is generated based on configurations returned by the Cloudera Manager heartbeat response.  I tried overriding the "http_host" config in the Hue safety valve to set it to 0.0.0.0 but that did not alter my URL at all.

I checked and indeed the heartbeat response does not include the value for http_host.  I think this is a Cloudera Manager bug/issue, but I'll need to take a closer look.

 

That said, I decided to hack the hue_adapters.py code to just use the host string "0.0.0.0" and reproduced the issue.

 

The question I have, then, is how Cloudera Manager is reporting "0.0.0.0".  That indicates to me that there is some problem with the hostname of the host as it appears in your Cloudera Manager Hosts tab.  Cloudera Manager will attempt to use the hostname that is reported by the agent to set Hue's http_host value to the hostname on which Hue is running.  If it is not able to do so, it is possible it leaves the Hue default, which is "0.0.0.0".

 

I'll keep looking, but please check what your Hue host is named in Cloudera Manager's Hosts tab.

 

Thanks!

 

Ben

Champion
Oh wow, I think I confirmed it in CDH 5.8.2 on RHEL 7.1.

Is this a related log entry?

[12/Jan/2017 03:52:12 +0000] 60662 Monitor-GenericMonitor throttling_logger ERROR Error calling is alive at 'https://abo-lp3-exted01.wdc.com:8888/desktop/debug/is_alive'

@jehalter

Can you run the python command that Ben posted earlier?

Contributor

@bgooley and @mbigelow -- thanks for your help on this!

 

Here is the fqdn, as given by the python command:

 

$ python -c 'import socket;print socket.getfqdn()'
bsl-ib-c4.uncc.edu

 

Which matches the hostname in the hosts tab in CM:

(screenshot of the CM Hosts tab showing bsl-ib-c4.uncc.edu)

 

The CM Host Inspector is happy with all hosts, including this one.
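Beyond socket.getfqdn(), it may be worth cross-checking the other names and addresses the host reports, since the agent, Hue, and CM can each pick up a different one. A small diagnostic sketch (my own, not Cloudera tooling):

```python
import socket

hostname = socket.gethostname()
fqdn = socket.getfqdn()
print("gethostname():", hostname)
print("getfqdn():    ", fqdn)

# Forward resolution of the FQDN; a failure or surprising address
# here would point at /etc/hosts or DNS problems.
try:
    addr = socket.gethostbyname(fqdn)
    print("resolves to:  ", addr)
except socket.gaierror as exc:
    print("resolution failed:", exc)
```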

 

Would it be worth moving Hue to another machine in the cluster, to see if that changes anything? Btw this is a test cluster that I am using to try out 5.10.0 before rolling it into production, so I have the luxury of trying just about anything without affecting any users.