
Recommendation for proxying Hadoop services without built-in Knox support

avatar

Knox 0.6.0 has built-in support for these 7 services:

  • WebHDFS
  • WebHCat
  • Oozie
  • HBase
  • Hive
  • Yarn
  • Storm

Is there a recommended approach to expose other services from the gateway host? Particularly web UIs, such as Ambari & Ranger.


avatar

I suppose you could use something like HAProxy. However, if you have Kerberos and SPNEGO enabled, you would need to add the proxy tickets, similar to the Oozie HA setup described in the Cloudera docs (I would point to ours if we actually described that): http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/cdh_sg_oozie_ha_ker...

avatar

I figured something like haproxy or nginx would work. Ideally I'm looking for an example config, or, even better, an example from anyone who has extended Knox with a custom provider.

avatar

We would certainly recommend using Knox's extensibility models to cover any components that lack coverage before we get there ourselves. Several integrations that we don't yet officially support, such as Falcon, have already been developed in the community. The same goes for UI coverage, where the community has added support for the HDFS and YARN UIs, among others. The Knox Developer's Guide is a great resource that the community has used to jump-start these efforts. Of course, looking at the implementation of the existing integrations is a great place to start as well.

avatar

The openweathermap example in the Knox Dev Guide looks great as a reference for extending Knox yourself. Do you know where some existing community extensions, like the Falcon or NN/RM UIs, can be found? I checked the Hortonworks Gallery with no luck.

avatar

These extensions are committed to the Apache Knox repo itself. They all use the config-driven extension model, so you need to look in the gateway-service-definitions module. In particular, look in this directory. Now that you mention the openweathermap example, I need to update it to the new configuration-based model, at least as a comparison to the code-based extension. The Developer's Guide does briefly cover the config-based extension.

avatar

@Kevin Minder moving to the best answer.

avatar

For quick reference, here's an example of adding Oozie UI to HDP 2.4 Sandbox:

1. start Sandbox and make sure all non-maintenance services are running

2. add service definition:

git clone https://git-wip-us.apache.org/repos/asf/knox.git
cp -R knox/gateway-service-definitions/src/main/resources/services/oozieui /var/lib/knox/data-2.4.0.0-169/services/
chown -R knox:knox /var/lib/knox/data-2.4.0.0-169/services/oozieui

3. add OOZIEUI service to default.xml topology (Ambari > Knox > Configs > Advanced topology)

<service>
    <role>OOZIEUI</role>
    <url>http://{{oozie_server_host}}:{{oozie_server_port}}/oozie</url>
</service>

4. start (or restart) Knox & Demo LDAP (using Ambari)

5. visit https://localhost:8443/gateway/default/oozie/
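
Step 5 can also be checked from a terminal. The following is a minimal sketch, not part of the original instructions: the helper names `gateway_url` and `check_gateway` are made up for illustration, and the `guest:guest-password` credentials in the usage note are an assumption based on the Demo LDAP defaults.

```shell
# Hypothetical helpers for checking a service proxied through Knox.

# Build the external URL that Knox exposes for a proxied service path.
# Arguments: host, topology name, service path, optional gateway port.
gateway_url() {
  host="$1"; topology="$2"; path="$3"; port="${4:-8443}"
  echo "https://${host}:${port}/gateway/${topology}/${path}"
}

# Print only the HTTP status code returned through the gateway.
# Arguments: user:password, URL.
check_gateway() {
  # -k skips TLS certificate verification, which is usually needed
  # when the gateway presents a self-signed certificate.
  curl -sk -o /dev/null -w '%{http_code}\n' -u "$1" "$2"
}
```

For example, `check_gateway guest:guest-password "$(gateway_url localhost default oozie/)"` prints the status code for the Oozie UI; anything other than a success or redirect code suggests the service definition or topology entry was not picked up.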

avatar
Expert Contributor

Thanks for sharing that! I followed your instructions and did the same for yarnui (adapting the paths slightly).

The root and logs redirections work, but many other rewrite rules (especially those ending in {**}) are not applied by Knox.

Example: When calling https://172.18.10.163:8443/gateway/default/yarn, the site loads, but the static resources do not load. In /var/log/knox/gateway.log it says:

2016-05-12 11:13:34,109 DEBUG hadoop.gateway (GatewayFilter.java:doFilter(110)) - Received request: GET /yarn
2016-05-12 11:13:34,147 INFO  hadoop.gateway (KnoxLdapRealm.java:getUserDn(556)) - Computed userDn: uid=guest,ou=people,dc=hadoop,dc=apache,dc=org using dnTemplate for principal: guest
2016-05-12 11:13:34,227 INFO  hadoop.gateway (AclsAuthorizationFilter.java:init(62)) - Initializing AclsAuthz Provider for: YARNUI
2016-05-12 11:13:34,228 DEBUG hadoop.gateway (AclsAuthorizationFilter.java:init(70)) - ACL Processing Mode is: AND
2016-05-12 11:13:34,229 DEBUG hadoop.gateway (AclParser.java:parseAcls(59)) - No ACLs found for: YARNUI
2016-05-12 11:13:34,230 INFO  hadoop.gateway (AclsAuthorizationFilter.java:doFilter(85)) - Access Granted: true
2016-05-12 11:13:34,434 DEBUG hadoop.gateway (UrlRewriteProcessor.java:rewrite(155)) - Rewrote URL: https://172.18.10.163:8443/gateway/default/yarn, direction: IN via implicit rule: YARNUI/yarn/inbound/root to URL: http://resourcemanagerhost.local:8088/cluster
	[...]
2016-05-12 11:13:35,074 DEBUG hadoop.gateway (GatewayFilter.java:doFilter(110)) - Received request: GET /yarn/static/jquery/jquery-ui-1.9.1.custom.min.js
2016-05-12 11:13:35,417 DEBUG hadoop.gateway (GatewayFilter.java:doFilter(110)) - Received request: GET /yarn/static/jquery/jquery-1.8.2.min.js

That's the end of the file. Nothing is logged after that.

I'm using HDP 2.3.4.7 with Knox 0.6.0.

I would appreciate your help, @Alex Miller or @Kevin Minder.

Thanks!

avatar

@Benjamin R Does it work if you add a trailing slash?

avatar
Expert Contributor

@Alex Miller That makes no difference. The page https://172.18.10.163:8443/gateway/default/yarn/ still loads, but static resources and pages like https://172.18.10.163:8443/gateway/default/yarn/apps/ACCEPTED do not.

Edit: I found the error. In my topology file, I had previously added a custom stanza (role: SERVICE-TEST) for which I never created a service definition. That made Knox behave strangely. After removing that block, the YARN UI works over Knox.

Thanks, Alex
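
A mismatch like the one described above can be spotted with a rough check that compares the roles in a topology file against the service definitions on disk. This is only a heuristic sketch: the function name `missing_roles` is made up, it assumes every service role is declared as `role="..."` in some file under the services directory, and any role Knox handles without such a file would show up as a false positive.

```shell
# Print each <service> role in a topology file that has no matching
# role="..." attribute anywhere under the services directory.
# Arguments: topology file, services directory.
missing_roles() {
  topology="$1"; services_dir="$2"
  # Collect <role> values, but only inside <service> blocks, so that
  # provider roles (authentication, authorization, ...) are ignored.
  awk '
    /<service>/ { svc = 1 }
    svc && match($0, /<role>[^<]*<\/role>/) {
      print substr($0, RSTART + 6, RLENGTH - 13)
    }
    /<\/service>/ { svc = 0 }
  ' "$topology" | sort -u |
  while read -r role; do
    # -F treats the pattern as a fixed string; -q and -s keep grep quiet.
    if ! grep -rqsF -e "role=\"$role\"" "$services_dir"; then
      echo "$role"
    fi
  done
}
```

On the Sandbox from the earlier post, this might be called as `missing_roles /etc/knox/conf/topologies/default.xml /var/lib/knox/data-2.4.0.0-169/services`, where the topology path is an assumption. A role such as SERVICE-TEST with no definition would be printed.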

avatar
New Contributor

If you're looking to improve access to back-end service UIs for the ops team, as opposed to exposing the services to the larger user base, we use SSH tunneling via our admin jump hosts to effectively create a personal SOCKS proxy for each ops/admin user.

We then use one of the dynamic proxy config plugins in Chrome or Firefox to direct requests to those services based on hostname or, in our case, the domain of the Hadoop environment.

This has the advantage of being very transparent, and service URLs all tend to resolve correctly, including HTTPS-based services. The disadvantage is that the person using this approach needs to know how to set up an SSH tunnel and how to configure their browser to use that tunnel for the Hadoop services.
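
As a rough illustration of the tunnel described above, here is a minimal sketch. The function name `open_socks_tunnel`, the `DRY_RUN` switch, the user and host names, and local port 1080 are all placeholders rather than details from our setup.

```shell
# Open a personal SOCKS proxy through a jump host.
# Arguments: user, jump host, optional local port (default 1080).
# With DRY_RUN set, print the ssh command instead of running it.
open_socks_tunnel() {
  user="$1"; jump_host="$2"; port="${3:-1080}"
  # -N: do not run a remote command, just forward.
  # -D: dynamic (SOCKS) forwarding on the given local port.
  set -- ssh -N -D "$port" "$user@$jump_host"
  if [ -n "${DRY_RUN:-}" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

With the tunnel open (for example `open_socks_tunnel alice jump.example.com`), the browser plugin is pointed at `localhost:1080` as a SOCKS5 proxy. A quick command-line check is `curl --socks5-hostname localhost:1080 http://resourcemanagerhost.local:8088/cluster`, which resolves the hostname on the far side of the tunnel.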
