Member since
09-08-2017
27
Posts
11
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
7820 | 08-22-2018 10:42 PM |
01-03-2023
12:07 PM
Ssumers, Can you please share the value you added in gateway.dispatch.whitelist?
10-24-2021
08:37 PM
How do I install the Docker version of the sandbox (hdp-sandbox 2.6.5) on a Mac? I keep getting "502 Bad Gateway" when I try to open the Ambari page. Even when I reach the Dashboard page, I end up with many red flags.
12-20-2017
06:33 PM
@Phil Zampino, this is a really informative and valuable article. Thanks for writing it. Keep it up!
12-13-2017
10:59 PM
4 Kudos
Introduction

The 0.14.0 release of Apache Knox includes the ability to dynamically determine topology endpoints for Hadoop services in Ambari-managed clusters. Prior to this release, users had to determine each of these endpoint URLs by navigating the Ambari UI (or combing through the various cluster configuration files), and explicitly add them to their topology descriptors; there was a lot of potential for human error. Support for a new, simplified topology descriptor has been added to leverage this dynamic endpoint discovery and facilitate provider configuration sharing across topologies. This is a dramatic improvement in the usability of Knox.

Simplified Descriptors

Simplified descriptors are a means to facilitate provider configuration sharing and service endpoint discovery. Rather than editing an XML topology descriptor, it’s now possible to create a simpler descriptor that declaratively specifies the desired contents of a topology, which will ultimately yield a full topology descriptor and corresponding deployment.

These simplified descriptors allow service URLs to be specified explicitly, just as full topology descriptors do. However, if URLs are omitted for a service, Knox will attempt to discover that service’s URLs from the Hadoop cluster. Currently, this behavior is only supported for clusters managed by Ambari.

Descriptor Properties

Property | Description |
---|---|
discovery-address | The endpoint address for the discovery source |
discovery-type | The discovery source type. (Currently, the only supported type is AMBARI) |
discovery-user | The username with permission to access the discovery source. If omitted, then Knox will check for an alias named ambari.discovery.user, and use its value if defined. |
discovery-pwd-alias | The alias of the password for the user with permission to access the discovery source. If omitted, then Knox will check for an alias named ambari.discovery.password, and use its value if defined. |
provider-config-ref | A reference to a provider configuration in {GATEWAY_HOME}/conf/shared-providers/ |
cluster | The name of the cluster from which the topology service endpoints should be determined |
services | The collection of services to be included in the topology |

File Formats

Two file formats are supported for two distinct purposes:

Format | Purpose |
---|---|
YAML | Intended for hand-editing a simplified descriptor (because of its readability and support for comments) |
JSON | Intended to be used for API interaction |

YAML Example (based on the HDP Docker Sandbox)

---
discovery-address : http://sandbox.hortonworks.com:8080
discovery-user : maria_dev
discovery-pwd-alias : ambari.discovery.password
provider-config-ref : sandbox-providers
cluster: Sandbox
services:
- name: NAMENODE
- name: JOBTRACKER
- name: WEBHDFS
- name: WEBHCAT
- name: OOZIE
- name: WEBHBASE
- name: HIVE
- name: RESOURCEMANAGER

A Note About Aliases

This example illustrates the specification of credentials for the interaction with Ambari. If no credentials are specified, then the default aliases are queried. Use of the default aliases is sufficient for scenarios where topology discovery will only interact with a single Ambari instance. With multiple Ambari instances, however, each will most likely require a different set of credentials. The discovery-user and discovery-pwd-alias properties exist for this purpose. Whether using the default credential aliases or specifying a custom password alias, these aliases must be defined prior to any attempt to deploy a topology using a simplified descriptor.

Externalized Provider Configurations

Sometimes, the same provider configuration is applied to multiple Knox topologies. Unlike XML topology descriptors, simplified descriptors do not contain provider configuration; rather, they contain references to external provider configuration. With the provider configuration externalized from the simple descriptors, a single configuration can be applied to multiple topologies. This helps reduce the duplication of configuration, and the need to update multiple configuration files when a policy change is required. Updating a provider configuration triggers an update to all those topologies that reference it.

The content of an externalized provider configuration is identical to the gateway element from a full topology descriptor. The only difference is that it’s defined in its own XML file in {GATEWAY_HOME}/conf/shared-providers/.

Monitored Directories

Effecting topology changes is as simple as modifying files in two specific directories.

The {GATEWAY_HOME}/conf/shared-providers/ directory is the location where Knox looks for provider configurations. This directory is monitored for changes, such that modifying a provider configuration file therein will trigger updates to any referencing simplified descriptors in the {GATEWAY_HOME}/conf/descriptors/ directory. Care should be taken when deleting these files if there are referencing descriptors; any subsequent modifications of referencing descriptors will fail when the deleted provider configuration cannot be found. The references should all be modified before deleting the provider configuration.

Likewise, the {GATEWAY_HOME}/conf/descriptors/ directory is monitored for changes, such that adding or modifying a simplified descriptor file in this directory will trigger the generation and deployment of a topology. Conversely, deleting a descriptor from this directory will result in the undeployment of the previously generated topology.

Generated Topologies

Generated topology XML descriptors include an element to indicate the fact that they've been generated.

<generated>true</generated>

These generated topology XML files should not be modified directly. Any changes that are made could potentially be overwritten as a result of a change to the source descriptor, a change to the cluster configuration, or a gateway restart. While deleting a generated topology file will result in an undeployment of that topology, any of the aforementioned changes could result in the regeneration and deployment of that topology.

The Admin API and Admin UI disallow modifications to generated topologies. The Admin API does provide the ability to modify simple descriptors and provider configurations, and the Admin UI will provide a similar capability in the future. The only reliable means to modify generated topologies is through changes to their respective source descriptors and provider configurations, either directly on the gateway host or using the Admin API.

Admin API

The Admin API has been augmented to support the management of provider configuration and simplified descriptor resources.

Get a list of the current provider configurations deployed to the gateway:
/gateway/admin/api/v1/providerconfig

Get/Put/Delete the provider configuration identified by {id}:
/gateway/admin/api/v1/providerconfig/{id}

Get a list of the current descriptors deployed to the gateway:
/gateway/admin/api/v1/descriptors

Get/Put/Delete the descriptor identified by {id}:
/gateway/admin/api/v1/descriptors/{id}

For more complete API details, see the Admin API section of the user guide.

Try It!

0. Install the HDP Sandbox (https://hortonworks.com/downloads/#sandbox)

1. Create the discovery aliases
{GATEWAY_HOME}/bin/knoxcli.sh create-alias ambari.discovery.user --value maria_dev
{GATEWAY_HOME}/bin/knoxcli.sh create-alias ambari.discovery.password --value maria_dev

2. Start the demo LDAP server and the gateway
{GATEWAY_HOME}/bin/ldap.sh start
{GATEWAY_HOME}/bin/gateway.sh start

3. Create/copy a provider config to the {GATEWAY_HOME}/conf/shared-providers/ directory

Sample sandbox-providers.xml

<gateway>
<provider>
<role>authentication</role>
<name>ShiroProvider</name>
<enabled>true</enabled>
<param>
<name>sessionTimeout</name>
<value>30</value>
</param>
<param>
<name>main.ldapRealm</name>
<value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value>
</param>
<param>
<name>main.ldapContextFactory</name>
<value>org.apache.hadoop.gateway.shirorealm.KnoxLdapContextFactory</value>
</param>
<param>
<name>main.ldapRealm.contextFactory</name>
<value>$ldapContextFactory</value>
</param>
<param>
<name>main.ldapRealm.userDnTemplate</name>
<value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
</param>
<param>
<name>main.ldapRealm.contextFactory.url</name>
<value>ldap://localhost:33389</value>
</param>
<param>
<name>main.ldapRealm.contextFactory.authenticationMechanism</name>
<value>simple</value>
</param>
<param>
<name>urls./**</name>
<value>authcBasic</value>
</param>
</provider>
</gateway>

4. Create/copy a simple descriptor to the descriptors directory (you can use the YAML sample presented earlier in this article)
cp simple-sandbox_y.yml {GATEWAY_HOME}/conf/descriptors/

5. Verify {GATEWAY_HOME}/logs/gateway.log and the contents of the {GATEWAY_HOME}/conf/topologies directory. There should be a file named simple-sandbox_y.xml in the topologies directory.

6. Test the deployed topology by invoking a request to a proxied Hadoop service
curl -iku guest:guest-password 'https://localhost:8443/gateway/simple-sandbox_y/webhdfs/v1/?op=LISTSTATUS'

7. Modify the provider config
touch {GATEWAY_HOME}/conf/shared-providers/sandbox-providers.xml

8. Test (check the timestamps of {GATEWAY_HOME}/conf/descriptors/simple-sandbox_y.yml and {GATEWAY_HOME}/conf/topologies/simple-sandbox_y.xml, and {GATEWAY_HOME}/logs/gateway.log to verify the regeneration and redeployment of the topology)

9. Modify the descriptor
touch {GATEWAY_HOME}/conf/descriptors/simple-sandbox_y.yml

10. Check the timestamp of {GATEWAY_HOME}/conf/topologies/simple-sandbox_y.xml, and {GATEWAY_HOME}/logs/gateway.log to verify the regeneration and redeployment of the topology.

11. Delete the simple descriptor from {GATEWAY_HOME}/conf/descriptors (Verify the removal of {GATEWAY_HOME}/conf/topologies/simple-sandbox_y.xml, and check {GATEWAY_HOME}/logs/gateway.log to verify undeployment of the topology)

12. Repeat steps 3-11 using the Admin API instead of filesystem copies

a. Deploy the provider configuration to the gateway:
curl -iku admin:admin-password https://localhost:8443/gateway/admin/api/v1/providerconfig/sandbox-providers -X PUT -H Content-Type:application/xml -d "@sandbox-providers.xml"

b. The API requires descriptors to be in the JSON format:

simple-sandbox_j.json

{
"discovery-address":"http://localhost:8080",
"provider-config-ref":"sandbox-providers",
"cluster":"Sandbox",
"services":[
{"name":"NAMENODE"},
{"name":"JOBTRACKER"},
{"name":"WEBHDFS"},
{"name":"WEBHCAT"},
{"name":"OOZIE"},
{"name":"WEBHBASE"},
{"name":"RESOURCEMANAGER"}
]
}

Deploy the JSON descriptor to the gateway:
curl -iku admin:admin-password https://localhost:8443/gateway/admin/api/v1/descriptors/simple-sandbox -X PUT -H Content-Type:application/json -d "@simple-sandbox_j.json"

c. Test the resulting deployed topology
curl -iku guest:guest-password 'https://localhost:8443/gateway/simple-sandbox/webhdfs/v1/?op=LISTSTATUS'

d. Try to delete the provider configuration (it should be disallowed because the simple-sandbox descriptor, deployed from simple-sandbox_j.json, references it):
curl -iku admin:admin-password 'https://localhost:8443/gateway/admin/api/v1/providerconfig/sandbox-providers' -X DELETE

e. Delete the referencing descriptor:
curl -iku admin:admin-password 'https://localhost:8443/gateway/admin/api/v1/descriptors/simple-sandbox' -X DELETE

f. Then, try to delete the provider configuration again. It should succeed this time because there are no referencing descriptors.

Summary

Hopefully, the benefits of this new functionality are clear. Defining and deploying topologies for Ambari-managed Hadoop clusters is now easier and less error-prone. Provider configurations can now be shared by multiple topologies, reducing duplicate configuration and the associated potential for errors in managing changes.

There are related UI enhancements coming soon, which will further ease the management of topologies, and continue the enhancement of Knox's usability. Check out the User Guide for more details about these additions.
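
For readers who would rather script step 12 than type the curl commands, here is a minimal Python sketch of building a JSON simple descriptor and deploying it through the Admin API. It is not part of Knox: the helper names (build_descriptor, descriptor_url, put_descriptor) are hypothetical, while the property names, endpoint path, host, and credentials are the ones used in the examples in this article. Like curl -k, it disables TLS certificate verification, which is an assumption that only makes sense for a sandbox with a self-signed certificate.

```python
import base64
import json
import ssl
import urllib.request


def build_descriptor(discovery_address, provider_config_ref, cluster, services):
    """Return a simple descriptor as a dict, using the property names from this article."""
    return {
        "discovery-address": discovery_address,
        "provider-config-ref": provider_config_ref,
        "cluster": cluster,
        # Services without explicit URLs are left for Knox to discover via Ambari.
        "services": [{"name": name} for name in services],
    }


def descriptor_url(gateway_base, descriptor_name):
    """Build the Admin API URL for one descriptor (path as listed in the Admin API section)."""
    base = gateway_base.rstrip("/")
    return f"{base}/gateway/admin/api/v1/descriptors/{descriptor_name}"


def put_descriptor(url, descriptor, user, password):
    """PUT the descriptor as JSON with HTTP Basic auth and return the HTTP status code.

    Assumption: a sandbox gateway with a self-signed certificate, so certificate
    verification is turned off here, mirroring the -k flag of the curl examples.
    """
    body = json.dumps(descriptor).encode("utf-8")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(
        url,
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
    )
    context = ssl.create_default_context()
    context.check_hostname = False  # must be disabled before setting CERT_NONE
    context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, context=context) as response:
        return response.status


if __name__ == "__main__":
    descriptor = build_descriptor(
        "http://localhost:8080",
        "sandbox-providers",
        "Sandbox",
        ["NAMENODE", "JOBTRACKER", "WEBHDFS", "WEBHCAT", "OOZIE", "WEBHBASE", "RESOURCEMANAGER"],
    )
    url = descriptor_url("https://localhost:8443", "simple-sandbox")
    print(url)
    print(json.dumps(descriptor, indent=2))
    # With the sandbox gateway running, uncomment to deploy the descriptor:
    # print(put_descriptor(url, descriptor, "admin", "admin-password"))
```

Keeping the descriptor construction separate from the HTTP call means the generated JSON can be inspected, or written to a file for use with curl, before anything is sent to the gateway.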
09-14-2017
06:59 PM
+1. Very useful article, bookmarked. Thank you!
12-24-2018
05:38 PM
I followed this with HDP 2.6.5, and the HBase UI became accessible at the given URL, but it shows many errors and several links inside it do not work. I posted a question about how to fix this, followed by an answer that resolves most of these issues, here: https://community.hortonworks.com/questions/231948/how-to-fix-knox-hbase-ui.html You are welcome to test this and include these fixes in your article if you find it appropriate. Best regards