HUE server not able to send any kind of HTTP(GET, etc) request to WEBHDFS
- Labels: Apache Hadoop, Apache Oozie, Cloudera Hue
Created on ‎08-28-2016 10:28 PM - edited ‎09-16-2022 03:37 AM
Dear HUE community,
We at NOKIA Technologies have been using the Cloudera stack for our analytics-based products for a few months. We have had a very good experience with it, especially the HUE-enabled Oozie workflow designer, which has made our lives much easier.
We have now decided to use HUE for the same purpose on our old clusters, which do not run Cloudera.
We followed the instructions and installed HUE (and its other requirements) on our old clusters. But when we open the UI in a browser, we see that the HUE server cannot communicate with WebHDFS.
We know WebHDFS itself works: the same GET request, copied from the HUE logs, succeeds when run from a browser window, and an entry appears in the WebHDFS log.
We know HUE cannot reach WebHDFS because no request entry appears in the WebHDFS log when the file browser is opened in the HUE UI.
I have checked the superuser configuration and privileges, and they are fine. For simplicity, we have a single user for all services, which is also the Hadoop superuser.
The relevant logs are pasted below:
.
.
.
[28/Aug/2016 22:21:33 -0700] middleware DEBUG {"1472473293": {"status": 200, "impersonator": null, "service": "jobbrowser", "url": "/jobbrowser/", "user": "spark1", "ip_address": "135.x.x.x", "authorization_failure": false}}
[28/Aug/2016 22:22:04 -0700] access INFO 135.x.x.x spark1 - "GET /jobbrowser/ HTTP/1.1"
[28/Aug/2016 22:22:04 -0700] connectionpool DEBUG "GET http://XXXX:8088/ws/v1/cluster/apps?limit=10000&user=spark1&finalStatus=UNDEFINED HTTP/1.1" 302 90
[28/Aug/2016 22:22:04 -0700] connectionpool INFO Resetting dropped connection: proxy.XXXXXX.com
[28/Aug/2016 22:22:10 -0700] connectionpool DEBUG "GET http://www.XXXX.com/ws/v1/cluster/apps?limit=10000&user=spark1&finalStatus=UNDEFINED HTTP/1.1" 404 1245
.
.
.
.
.
.
(error 404): Traceback (most recent call last):
File "/opt/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/core/handlers/base.py", line 112, in get_response
response = wrapped_callback(request, *callback_args, **callback_kwargs)
File "/opt/hue/build/env/lib/python2.6/site-packages/Django-1.6.10-py2.6.egg/django/db/transaction.py", line 371, in inner
return func(*args, **kwargs)
File "/opt/hue/apps/jobbrowser/src/jobbrowser/views.py", line 121, in jobs
raise ex
RestException: <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1"/>
<title>404 - File or directory not found.</title>
<style type="text/css">
Created ‎08-28-2016 11:01 PM
Well, found the problem...
It seems the HUE server sends its requests to WebHDFS through the company proxy server, even though the HUE server is set up on the same host as the WebHDFS server.
The proxy server evidently could not resolve the hostname of the WebHDFS server, so it could not forward the request.
I changed the hostname to its public IP address in hue.ini and everything works.
If anyone knows how to make the HUE server bypass that proxy, please let me know.
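On the proxy question, one possible explanation (an assumption, not something confirmed in this thread) is that the HUE process inherits `http_proxy` from its environment, and that its Python HTTP client honors it. If so, listing the local hosts in `no_proxy` before starting HUE may be enough. The sketch below uses only the Python 3 standard library to show how `no_proxy` changes the routing decision. The proxy URL and hostnames are placeholders, and `uses_proxy` is a hypothetical helper, not part of HUE.

```python
import os
import urllib.request


def uses_proxy(host):
    """Return True if urllib would send an http request for `host` via a proxy.

    Reads the proxy settings from the *_proxy environment variables only.
    """
    proxies = urllib.request.getproxies_environment()
    if "http" not in proxies:
        return False
    # proxy_bypass_environment() checks `host` against the no_proxy list.
    return not urllib.request.proxy_bypass_environment(host, proxies)


# Placeholder proxy, standing in for the company proxy from the logs.
os.environ["http_proxy"] = "http://proxy.example.com:8080"
os.environ.pop("no_proxy", None)
os.environ.pop("NO_PROXY", None)
print("localhost proxied without no_proxy:", uses_proxy("localhost"))

# Exempt local addresses; other hosts still go through the proxy.
os.environ["no_proxy"] = "localhost,127.0.0.1"
print("localhost proxied with no_proxy:", uses_proxy("localhost"))
print("remote host proxied:", uses_proxy("webhdfs.example.com"))
```

If this assumption holds, exporting `no_proxy` with the cluster hostnames (or unsetting `http_proxy`) in the shell or init script that launches HUE would be the equivalent fix. Whether HUE's bundled client reads these variables should be verified on the installation in question.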
Created ‎09-07-2016 05:51 AM
[hadoop]
[[hdfs_clusters]]
[[[default]]]
webhdfs_url=http://localhost:50070/webhdfs/v1
So, how about you put localhost there?
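To check whether a given `webhdfs_url` is reachable from the HUE host without any proxy in the way, a small standalone test along these lines might help. It is a sketch under assumptions: it uses Python 3's standard library rather than HUE's own client, the user name `spark1` is taken from the logs above, and `liststatus_url`, `direct_opener`, and `fetch_direct` are hypothetical helper names.

```python
import urllib.parse
import urllib.request


def liststatus_url(webhdfs_url, path="/", user="spark1"):
    """Build a WebHDFS LISTSTATUS URL for `path` under the base URL."""
    query = urllib.parse.urlencode({"op": "LISTSTATUS", "user.name": user})
    return webhdfs_url.rstrip("/") + urllib.parse.quote(path) + "?" + query


def direct_opener():
    """Return an opener that ignores any proxy settings in the environment."""
    # An empty dict tells ProxyHandler to disable autodetected proxies.
    return urllib.request.build_opener(urllib.request.ProxyHandler({}))


def fetch_direct(url, timeout=10):
    """Fetch `url` without a proxy; return (HTTP status, first 200 bytes)."""
    with direct_opener().open(url, timeout=timeout) as resp:
        return resp.status, resp.read()[:200]


url = liststatus_url("http://localhost:50070/webhdfs/v1")
print(url)
```

Calling `fetch_direct(url)` on the HUE host should then produce an entry in the WebHDFS log if the service is reachable directly. If that succeeds while HUE's own requests still go missing, the proxy settings of the HUE process are the likely culprit.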
