Member since: 09-08-2017
Posts: 27
Kudos Received: 11
Solutions: 1

My Accepted Solutions
Title | Views | Posted
---|---|---
| 1878 | 08-22-2018 10:42 PM
04-09-2019
12:52 PM
Simply creating a provider configuration with LDAP config has no effect whatsoever on the running Knox instance, especially since you're not able to save the configuration. Are you certain that restarts of Knox and the LDAP server are indeed required? The behavior you're describing sounds like what happens when the SSO cookie expires, and reloading the page gives the opportunity to log in again. If you do indeed have to restart Knox and the LDAP server, then something is more seriously wrong, well beyond the Admin UI.
04-08-2019
01:37 AM
Check the ttl param value in the knoxsso topology. I suspect it’s too short, and the cookie is timing out before you attempt to save the provider configuration.
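For reference, that parameter is knoxsso.token.ttl in the KNOXSSO service of the knoxsso topology; a sketch (the value shown is illustrative, in milliseconds):

```xml
<service>
    <role>KNOXSSO</role>
    <param>
        <name>knoxsso.token.ttl</name>
        <!-- Illustrative TTL in milliseconds: 36000000 = 10 hours -->
        <value>36000000</value>
    </param>
</service>
```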
03-21-2019
02:45 PM
Perhaps debug logging would be helpful. Can you set the log level to debug, and see if anything useful appears in gateway.log?
03-15-2019
03:13 PM
Do you see any messages in gateway.log about a whitelist?
03-14-2019
05:02 PM
So, you're able to access the Knox UI via the public IP, but login is failing? Can you see anything relevant in the Knox logs? Is Knox SSO in play here? And what do you mean by "the topologies created from knox UI is throwing 404 error"?
01-30-2019
05:41 PM
@abbas mohammadnejad You have to explicitly set the gateway.dispatch.whitelist property in gateway-site.xml, such that the pattern will match the endpoint address.
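For illustration, a sketch of that property in gateway-site.xml (the pattern shown is only an example; tailor it to match your actual endpoint addresses):

```xml
<property>
    <name>gateway.dispatch.whitelist</name>
    <!-- Example pattern: permit dispatch to example.com hosts and localhost -->
    <value>^https?:\/\/(.+\.example\.com|localhost|127\.0\.0\.1):[0-9]+\/?.*$</value>
</property>
```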
01-24-2019
03:26 PM
Yes, as I had said, it has been fixed, but has not yet been released. It will be included in the 1.3.0 Apache release.
01-24-2019
02:35 PM
This issue has been addressed (KNOX-1731), but has not yet been released.
10-09-2018
07:11 PM
1 Kudo
@NN Which version of Knox is being used? I'm asking because there have been a lot of improvements in this area recently.
10-03-2018
07:24 PM
It looks like you're missing the gateway path segment in the actual command:

'https://10.224.155.25:9443/Vamshi/webhdfs/v1/?op=LISTSTATUS'

should be

'https://10.224.155.25:9443/gateway/Vamshi/webhdfs/v1/?op=LISTSTATUS'

And have you intentionally configured the gateway port to 9443 (vs the default 8443)?
09-28-2018
03:36 PM
Is your sandbox topology pointing to your cluster correctly? What does your gateway-audit.log show?
09-28-2018
12:47 PM
Have you considered using the Ambari REST API, or are you concerned with clusters which are not managed by Ambari? You could use curl to invoke it from your scripts.
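As a sketch of what such a script could look like (the Ambari address and credentials below are placeholders, not from this thread), the same request can be issued from Python rather than curl:

```python
import base64
import urllib.request

def ambari_request(base_url, path, user, password):
    """Build a GET request for the Ambari REST API with HTTP Basic auth."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    req = urllib.request.Request(f"{base_url}{path}")
    req.add_header("Authorization", f"Basic {token}")
    # Ambari expects this header on state-changing calls; harmless on GETs
    req.add_header("X-Requested-By", "ambari")
    return req

# Placeholder address and credentials:
req = ambari_request("http://ambari.example.com:8080",
                     "/api/v1/clusters", "admin", "admin")
# urllib.request.urlopen(req) would then return the JSON cluster list
```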
08-24-2018
09:00 PM
Thank you for following up. That's what I suspected, and it's good to document it here for future reference.
08-24-2018
02:26 AM
@Lian Jiang Can you explain why the default whitelist was not working for your deployment?
08-22-2018
10:42 PM
None of Ambari, Zeppelin, or RangerUI are affected by this whitelisting.
Can you see the default whitelist in gateway.log?
It should say something like "Applying a derived dispatch whitelist because none is configured in gateway-site: xxxxxxx"
02-03-2018
07:21 PM
@shashi cheppela Can you post your topology here? This is a little better explanation of what is required: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_security/content/setting_up_knox_services_ha.html
12-19-2017
07:33 PM
2 Kudos
Introduction

My Apache Knox Dynamic Service Endpoint Discovery article describes some exciting new functionality available in the 0.14.0 release of Apache Knox. The gateway is now able to dynamically determine the endpoint URLs of cluster services to proxy from Ambari. The associated benefits are described in that article. Another benefit of this new functionality, which is not mentioned in that article, is the added ability to dynamically respond to cluster configuration changes that affect generated Knox topologies by re-generating and re-deploying those topologies. Without this, deployed topologies can easily be broken when any of the proxied Hadoop services' configuration changes in the cluster.

Cluster Monitoring

When Knox deploys a simple topology descriptor, and generates a corresponding topology based on discovered cluster configuration details, it subsequently has the ability to monitor that cluster configuration for changes. When it discovers a change, it updates all of its topologies that are based on that modified cluster, and redeploys them. This has the potential to greatly reduce downtime for Knox due to cluster configuration changes.

For example, suppose a descriptor (docker-sandbox.json) is deployed, intended to proxy services in the HDP Docker Sandbox. Following the successful generation and deployment of the docker-sandbox topology, Knox can monitor the Sandbox cluster managed by Ambari. If an administrator were to update the dfs.namenode.http-address property value in the hdfs-site configuration, changing the port number for example, the Knox proxy for the WEBHDFS service would no longer work. However, if the Ambari cluster monitor is enabled, Knox would regenerate and redeploy the docker-sandbox topology, such that it would contain the correct port for the WEBHDFS service URL, and Knox clients would continue to work.
By default this monitor is disabled, but it can easily be enabled by setting the gateway.cluster.config.monitor.ambari.enabled property value to true in the gateway-site configuration.

<property>
    <name>gateway.cluster.config.monitor.ambari.enabled</name>
    <value>true</value>
    <description>Enable/disable Ambari cluster configuration monitoring.</description>
</property>

Also in the gateway-site configuration, there is a property for controlling the frequency with which Knox will check the clusters for which it has deployed topologies. For demonstration purposes, you may want to set this as low as 20 or 30 seconds.

<property>
    <name>gateway.cluster.config.monitor.ambari.interval</name>
    <value>60</value>
    <description>The interval (in seconds) for polling Ambari for cluster changes.</description>
</property>

Try It

The Apache Knox Dynamic Service Endpoint Discovery article includes instructions for deploying topologies using simple descriptors, employing service URL discovery. Starting from there, you can enable the Ambari cluster monitoring, and make a cluster configuration change like the one described in this article. Then, you'll see how Knox responds to the change, and adapts to continue providing the proxied WEBHDFS service to its clients.

1. Set the gateway.cluster.config.monitor.ambari.enabled property value to true in {GATEWAY_HOME}/conf/gateway-site.xml
2. Restart the gateway
3. Use Ambari to modify the hdfs-site dfs.namenode.http-address configuration property value as described in the example.
4. Allow the gateway to notice the configuration change (watch {GATEWAY_HOME}/logs/gateway.log for the messages)
5. Review {GATEWAY_HOME}/conf/topologies/docker-sandbox.xml, and notice the change to the WEBHDFS service URL.

Your sandbox must expose the new port you specified for the dfs.namenode.http-address property for Knox to be able to access the new endpoint; otherwise, even though the topology will be correct, requests will fail due to connection failure.

Summary

While it doesn't take long to describe, this feature is a significant addition to the value provided by Knox. The ability to dynamically adapt to cluster service configuration changes reduces the effort required by administrators (and the potential for errors) when making such changes. N.B., Statically-defined topologies (i.e., those deployed directly by a regular topology XML file) do NOT benefit from this monitoring support. More details are available in the User Guide.
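Conceptually, the monitor compares successive polls of the cluster configuration; the following sketch illustrates the idea (it is not the actual Knox implementation, and the function name is invented):

```python
def detect_changes(previous, current):
    """Return the config types whose properties differ between two polls."""
    return [config_type for config_type, props in current.items()
            if previous.get(config_type) != props]

# Simulated consecutive polls of the cluster's hdfs-site configuration
poll_1 = {"hdfs-site": {"dfs.namenode.http-address": "sandbox.hortonworks.com:50070"}}
poll_2 = {"hdfs-site": {"dfs.namenode.http-address": "sandbox.hortonworks.com:50075"}}

print(detect_changes(poll_1, poll_1))  # → []
print(detect_changes(poll_1, poll_2))  # → ['hdfs-site']
```

In Knox, a detected change triggers regeneration and redeployment of every topology generated from that cluster's descriptors.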
12-15-2017
03:58 PM
Change 'sandbox' to whatever the image name is in your local image repository. Alternatively, you can specify the container ID.
12-13-2017
10:59 PM
4 Kudos
Introduction

The 0.14.0 release of Apache Knox includes the ability to dynamically determine topology endpoints for Hadoop services in Ambari-managed clusters. Prior to this release, users had to determine each of these endpoint URLs by navigating the Ambari UI (or combing through the various cluster configuration files), and explicitly add them to their topology descriptors; there was a lot of potential for human error. Support for a new, simplified topology descriptor has been added to leverage this dynamic endpoint discovery and facilitate provider configuration sharing across topologies. This is a dramatic improvement in the usability of Knox.

Simplified Descriptors

Simplified descriptors are a means to facilitate provider configuration sharing and service endpoint discovery. Rather than editing an XML topology descriptor, it's now possible to create a simpler descriptor that declaratively specifies the desired contents of a topology, which will ultimately yield a full topology descriptor and corresponding deployment. These simplified descriptors allow service URLs to be specified explicitly, just as full topology descriptors do. However, if URLs are omitted for a service, Knox will attempt to discover that service's URLs from the Hadoop cluster. Currently, this behavior is only supported for clusters managed by Ambari.

Descriptor Properties

Property | Description
---|---
discovery-address | The endpoint address for the discovery source
discovery-type | The discovery source type. (Currently, the only supported type is AMBARI)
discovery-user | The username with permission to access the discovery source. If omitted, then Knox will check for an alias named ambari.discovery.user, and use its value if defined.
discovery-pwd-alias | The alias of the password for the user with permission to access the discovery source. If omitted, then Knox will check for an alias named ambari.discovery.password, and use its value if defined.
provider-config-ref | A reference to a provider configuration in {GATEWAY_HOME}/conf/shared-providers/
cluster | The name of the cluster from which the topology service endpoints should be determined
services | The collection of services to be included in the topology

File Formats

Two file formats are supported for two distinct purposes:

Format | Purpose
---|---
YAML | Intended for the individual hand-editing a simplified descriptor (because of its readability and support for comments)
JSON | Intended to be used for API interaction

YAML Example (based on the HDP Docker Sandbox)

---
discovery-address : http://sandbox.hortonworks.com:8080
discovery-user : maria_dev
discovery-pwd-alias : ambari.discovery.password
provider-config-ref : sandbox-providers
cluster: Sandbox
services:
- name: NAMENODE
- name: JOBTRACKER
- name: WEBHDFS
- name: WEBHCAT
- name: OOZIE
- name: WEBHBASE
- name: HIVE
- name: RESOURCEMANAGER

A Note About Aliases

This example illustrates the specification of credentials for the interaction with Ambari. If no credentials are specified, then the default aliases are queried. Use of the default aliases is sufficient for scenarios where topology discovery will only interact with a single Ambari instance. For multiple Ambari instances however, each will most likely require a different set of credentials. The discovery-user and discovery-pwd-alias properties exist for this purpose. Whether using the default credential aliases or specifying a custom password alias, these aliases must be defined prior to any attempt to deploy a topology using a simplified descriptor.

Externalized Provider Configurations

Sometimes, the same provider configuration is applied to multiple Knox topologies. Unlike XML topology descriptors, simplified descriptors do not contain provider configuration; rather, they contain references to external provider configuration. With the provider configuration externalized from the simple descriptors, a single configuration can be applied to multiple topologies. This helps reduce the duplication of configuration, and the need to update multiple configuration files when a policy change is required. Updating a provider configuration triggers an update to all those topologies that reference it. The contents of an externalized provider configuration are identical to the gateway element from a full topology descriptor. The only difference is that it's defined in its own XML file in {GATEWAY_HOME}/conf/shared-providers/.

Monitored Directories

Effecting topology changes is as simple as modifying files in two specific directories. The {GATEWAY_HOME}/conf/shared-providers/ directory is the location where Knox looks for provider configurations.
This directory is monitored for changes, such that modifying a provider configuration file therein will trigger updates to any referencing simplified descriptors in the {GATEWAY_HOME}/conf/descriptors/ directory. Care should be taken when deleting these files if there are referencing descriptors; any subsequent modifications of referencing descriptors will fail when the deleted provider configuration cannot be found. The references should all be modified before deleting the provider configuration.

Likewise, the {GATEWAY_HOME}/conf/descriptors/ directory is monitored for changes, such that adding or modifying a simplified descriptor file in this directory will trigger the generation and deployment of a topology. Deleting a descriptor from this directory will conversely result in the undeployment of the previously-generated topology.

Generated Topologies

Generated topology XML descriptors include an element to indicate the fact that they've been generated.

<generated>true</generated>

These generated topology XML files should not be modified directly. Any changes that are made could potentially be overwritten as a result of a change to the source descriptor, a change to the cluster configuration, or a gateway restart. While deleting a generated topology file will result in an undeployment of that topology, any of the aforementioned changes could result in the regeneration and deployment of that topology. The Admin API and Admin UI disallow modifications to generated topologies. The Admin API does provide the ability to modify simple descriptors and provider configurations, and the Admin UI will provide a similar capability in the future. The only reliable means to modify generated topologies is through changes to their respective source descriptors and provider configurations, either directly on the gateway host or using the Admin API.

Admin API

The Admin API has been augmented to support the management of provider configuration and simplified descriptor resources.
Get a list of the current provider configurations deployed to the gateway:
/gateway/admin/api/v1/providerconfig

Get/Put/Delete the provider configuration identified by {id}:
/gateway/admin/api/v1/providerconfig/{id}

Get a list of the current descriptors deployed to the gateway:
/gateway/admin/api/v1/descriptors

Get/Put/Delete the descriptor identified by {id}:
/gateway/admin/api/v1/descriptors/{id}

For more complete API details, see the Admin API section of the user guide.

Try It!

0. Install the HDP Sandbox (https://hortonworks.com/downloads/#sandbox)

1. Create the discovery aliases

{GATEWAY_HOME}/bin/knoxcli.sh create-alias ambari.discovery.user --value maria_dev
{GATEWAY_HOME}/bin/knoxcli.sh create-alias ambari.discovery.password --value maria_dev

2. Start the demo LDAP server and the gateway

{GATEWAY_HOME}/bin/ldap.sh start
{GATEWAY_HOME}/bin/gateway.sh start

3. Create/copy a provider config to the {GATEWAY_HOME}/conf/shared-providers/ directory

Sample sandbox-providers.xml

<gateway>
<provider>
<role>authentication</role>
<name>ShiroProvider</name>
<enabled>true</enabled>
<param>
<name>sessionTimeout</name>
<value>30</value>
</param>
<param>
<name>main.ldapRealm</name>
<value>org.apache.hadoop.gateway.shirorealm.KnoxLdapRealm</value>
</param>
<param>
<name>main.ldapContextFactory</name>
<value>org.apache.hadoop.gateway.shirorealm.KnoxLdapContextFactory</value>
</param>
<param>
<name>main.ldapRealm.contextFactory</name>
<value>$ldapContextFactory</value>
</param>
<param>
<name>main.ldapRealm.userDnTemplate</name>
<value>uid={0},ou=people,dc=hadoop,dc=apache,dc=org</value>
</param>
<param>
<name>main.ldapRealm.contextFactory.url</name>
<value>ldap://localhost:33389</value>
</param>
<param>
<name>main.ldapRealm.contextFactory.authenticationMechanism</name>
<value>simple</value>
</param>
<param>
<name>urls./**</name>
<value>authcBasic</value>
</param>
</provider>
</gateway>

4. Create/copy a simple descriptor to the descriptors directory (you can use the YAML sample presented earlier in this article)

cp simple-sandbox_y.yml {GATEWAY_HOME}/conf/descriptors/

5. Verify {GATEWAY_HOME}/logs/gateway.log and the contents of the {GATEWAY_HOME}/conf/topologies directory. There should be a file named simple-sandbox_y.xml in the topologies directory.

6. Test the deployed topology by invoking a request to a proxied Hadoop service

curl -iku guest:guest-password 'https://localhost:8443/gateway/simple-sandbox_y/webhdfs/v1/?op=LISTSTATUS'

7. Modify the provider config

touch {GATEWAY_HOME}/conf/shared-providers/sandbox-providers.xml

8. Test (check the timestamps of {GATEWAY_HOME}/conf/descriptors/simple-sandbox_y.yml and {GATEWAY_HOME}/conf/topologies/simple-sandbox_y.xml, and {GATEWAY_HOME}/logs/gateway.log to verify the regeneration and redeployment of the topology)

9. Modify the descriptor

touch {GATEWAY_HOME}/conf/descriptors/simple-sandbox_y.yml

10. Check the timestamp of {GATEWAY_HOME}/conf/topologies/simple-sandbox_y.xml, and {GATEWAY_HOME}/logs/gateway.log to verify the regeneration and redeployment of the topology.

11. Delete the simple descriptor from {GATEWAY_HOME}/conf/descriptors (Verify the removal of {GATEWAY_HOME}/conf/topologies/simple-sandbox_y.xml, and check {GATEWAY_HOME}/logs/gateway.log to verify undeployment of the topology)

12. Repeat steps 3-11 using the Admin API instead of filesystem copies

a. Deploy the provider configuration to the gateway:

curl -iku admin:admin-password https://localhost:8443/gateway/admin/api/v1/providerconfig/sandbox-providers -X PUT -H Content-Type:application/xml -d "@sandbox-providers.xml"

b. The API requires descriptors to be in the JSON format: simple-sandbox_j.json

{
"discovery-address":"http://localhost:8080",
"provider-config-ref":"sandbox-providers",
"cluster":"Sandbox",
"services":[
{"name":"NAMENODE"},
{"name":"JOBTRACKER"},
{"name":"WEBHDFS"},
{"name":"WEBHCAT"},
{"name":"OOZIE"},
{"name":"WEBHBASE"},
{"name":"RESOURCEMANAGER"}
]
}

Deploy the JSON descriptor to the gateway:

curl -iku admin:admin-password https://localhost:8443/gateway/admin/api/v1/descriptors/simple-sandbox -X PUT -H Content-Type:application/json -d "@simple-sandbox_j.json"

c. Test the resulting deployed topology

curl -iku guest:guest-password 'https://localhost:8443/gateway/simple-sandbox/webhdfs/v1/?op=LISTSTATUS'

d. Try to delete the provider configuration (It should be disallowed because simple-sandbox_j.json references it):

curl -iku admin:admin-password 'https://localhost:8443/gateway/admin/api/v1/providerconfig/sandbox-providers' -X DELETE

e. Delete the referencing descriptor:

curl -iku admin:admin-password 'https://localhost:8443/gateway/admin/api/v1/descriptors/simple-sandbox' -X DELETE

f. Then, try to delete the provider configuration again. It should succeed this time because there are no referencing descriptors.

Summary

Hopefully, the benefits of this new functionality are clear. Defining and deploying topologies for Ambari-managed Hadoop clusters is now easier and less error-prone. Provider configurations can now be shared by multiple topologies, reducing duplicate configuration and the associated potential for errors in managing changes. There are related UI enhancements coming soon, which will further ease the management of topologies, and continue the enhancement of Knox's usability. Check out the User Guide for more details about these additions.
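As a footnote to steps 8 and 10, the timestamp check can itself be scripted; a minimal sketch using temporary files in place of the real {GATEWAY_HOME} paths (the helper name is invented):

```python
import os
import tempfile
import time

def redeployed_after(topology_path, descriptor_path):
    """True if the generated topology is at least as new as its descriptor."""
    return os.path.getmtime(topology_path) >= os.path.getmtime(descriptor_path)

# Simulate: the descriptor is touched first, and the topology is
# regenerated afterward (temp files stand in for the real paths)
with tempfile.TemporaryDirectory() as d:
    desc = os.path.join(d, "simple-sandbox_y.yml")
    topo = os.path.join(d, "simple-sandbox_y.xml")
    open(desc, "w").close()
    time.sleep(0.1)
    open(topo, "w").close()
    regenerated = redeployed_after(topo, desc)

print(regenerated)  # → True
```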
10-06-2017
01:56 AM
This was a big help to me in getting the Docker sandbox up and running. The Sandbox doc says that a bunch of services are started automatically, but alas, they are not. Once I read your instructions for starting the service processes, the mystery was solved. Thanks!
10-05-2017
08:35 PM
Did you try port 443, per the page you referenced?
09-29-2017
12:13 AM
@bob bza You may also find useful the information in this article: Using Ambari to Automate Knox Topology Definitions. Specifically, while the following API will get you all the host component details, including the host(s):

api/v1/clusters/clustername/services/YARN/components/RESOURCEMANAGER

it will not get you the appropriate ports. For that, you need to interrogate the active YARN service configuration:

api/v1/clusters/clustername/configurations/service_config_versions?service_name=YARN

And look for the yarn-site configuration type, and the following properties therein:

yarn.http.policy - This will tell you if http is sufficient, or if you need to use https
yarn.resourcemanager.address - The value of this property is the HOST:PORT address you originally requested.

I hope that helps...
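Putting those two properties together, a sketch of the derivation (the helper name is invented and the sample values are illustrative):

```python
def resourcemanager_address(yarn_site):
    """Assemble scheme://host:port from the active yarn-site configuration."""
    # yarn.http.policy determines the scheme;
    # yarn.resourcemanager.address carries the HOST:PORT
    scheme = "https" if yarn_site.get("yarn.http.policy") == "HTTPS_ONLY" else "http"
    return f"{scheme}://{yarn_site['yarn.resourcemanager.address']}"

yarn_site = {"yarn.http.policy": "HTTP_ONLY",
             "yarn.resourcemanager.address": "c6402.ambari.apache.org:8050"}
print(resourcemanager_address(yarn_site))
# → http://c6402.ambari.apache.org:8050
```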
09-22-2017
02:53 AM
Is LDAP configured? If using the demo LDAP server, did you start it (bin/ldap.sh start) prior to starting the gateway?
09-21-2017
01:07 PM
If you're using the Knox demo LDAP server, then try guest/guest-password. If this is a production installation, then it will depend on your installation.
09-21-2017
01:01 PM
Are you asking about the URL for the Knox admin UI? If so, try https://hostname:8443/gateway/manager/admin-ui/
09-20-2017
06:31 PM
gateway-path is defined in conf/gateway-site.xml (gateway.path property), and cluster-name is actually the name of the Knox topology (the name of the topology XML file). The default gateway.path property value is gateway, and if the topology file is named mytopology.xml, the cluster-name is mytopology. So, putting them together: https://hostname:8443/gateway/mytopology/webhdfs/v1/?op=LISTSTATUS
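The assembly can be expressed as a small helper (a sketch; the function name is invented and the host is a placeholder):

```python
def knox_url(host, gateway_path, topology, service_path, port=8443):
    """Assemble https://{host}:{port}/{gateway-path}/{cluster-name}{service-path}"""
    return f"https://{host}:{port}/{gateway_path}/{topology}{service_path}"

print(knox_url("hostname", "gateway", "mytopology", "/webhdfs/v1/?op=LISTSTATUS"))
# → https://hostname:8443/gateway/mytopology/webhdfs/v1/?op=LISTSTATUS
```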
09-13-2017
06:29 PM
4 Kudos
Introduction & Motivation
Apache Knox acts as a proxy between Hadoop services and their consumers. Knox topology files define the mapping between requested services and their respective endpoints.
To make Knox aware of new mappings, a Knox administrator has to determine the endpoints for every service to be proxied, and add them to a topology file (XML). When working with an Ambari-managed cluster, the Knox administrator has the advantage of Ambari's knowledge of the cluster details. However, navigating the Ambari UI to locate the correct pieces of information to assemble these service endpoint URLs can be challenging, and having a human type them into an XML file introduces potential for errors. Fortunately, Ambari's API can be leveraged to automate the determination of these service endpoint URLs and, ultimately, the generation of Knox topology files.

Knox Topology Files
A topology file has to include a service entry for every Hadoop service to be proxied.
Listing 1: Example topology XML file
<topology>
<gateway><!-- Provider Configurations --></gateway>
<!-- Service Endpoint Mappings -->
<service>
<role>NAMENODE</role>
<url>hdfs://c6401.ambari.apache.org:8020</url>
</service>
<service>
<role>JOBTRACKER</role>
<url>rpc://c6402.ambari.apache.org:8050</url>
</service>
<service>
<role>WEBHDFS</role>
<url>http://c6401.ambari.apache.org:50070/webhdfs</url>
</service>
<service>
<role>WEBHCAT</role>
<url>http://c6402.ambari.apache.org:50111/templeton</url>
</service>
<service>
<role>OOZIE</role>
<url>http://c6402.ambari.apache.org:11000/oozie</url>
</service>
<service>
<role>WEBHBASE</role>
<url>http://c6401.ambari.apache.org:60080</url>
</service>
<service>
<role>HIVE</role>
<url>http://c6402.ambari.apache.org:10001/cliservice</url>
</service>
<service>
<role>RESOURCEMANAGER</role>
<url>http://c6402.ambari.apache.org:8088/ws</url>
</service>
<service>
<role>AMBARI</role>
<url>http://c6401.ambari.apache.org:8080</url>
</service>
</topology>
To populate the WEBHDFS role as it is in Listing 1, you have to log in to the Ambari console, navigate to the HDFS configurations, find the Advanced hdfs-site configuration, and look for the dfs.namenode.http-address property value. If you do this, you'll also notice that there is another property named dfs.namenode.https-address. It would be quite easy for a person to grab the value of the latter property rather than the former. Furthermore, this configuration does not tell you anything about the /webhdfs path required for this service. It may also be important to note the value of the General configuration's dfs.webhdfs.enabled property.
Similarly, for the JOBTRACKER role, you have to know to navigate to the YARN configurations, and check the Advanced yarn-site configuration to get the yarn.resourcemanager.address property value; make sure you don't grab the adjacent yarn.resourcemanager.admin.address by mistake. And remember that its URL scheme needs to be rpc.
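To make concrete what such automation has to get right, here is a sketch (the helper names are invented; the sample values match Listing 1) of assembling those two URLs from the relevant configuration properties:

```python
def webhdfs_url(hdfs_site):
    """WEBHDFS endpoint: scheme per dfs.http.policy, NameNode address, /webhdfs path."""
    if hdfs_site.get("dfs.http.policy") == "HTTPS_ONLY":
        return "https://" + hdfs_site["dfs.namenode.https-address"] + "/webhdfs"
    return "http://" + hdfs_site["dfs.namenode.http-address"] + "/webhdfs"

def jobtracker_url(yarn_site):
    """JOBTRACKER endpoint: rpc scheme with the ResourceManager RPC address."""
    return "rpc://" + yarn_site["yarn.resourcemanager.address"]

hdfs_site = {"dfs.http.policy": "HTTP_ONLY",
             "dfs.namenode.http-address": "c6401.ambari.apache.org:50070"}
yarn_site = {"yarn.resourcemanager.address": "c6402.ambari.apache.org:8050"}

print(webhdfs_url(hdfs_site))     # → http://c6401.ambari.apache.org:50070/webhdfs
print(jobtracker_url(yarn_site))  # → rpc://c6402.ambari.apache.org:8050
```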
Hopefully, you can see that correctly populating these service endpoint URLs is not a simple task, especially for individuals who are new to Hadoop, Ambari and/or Knox. HA deployments further increase this difficulty.

Automated Topology Generation With the Ambari REST API
The Ambari REST API provides the ability to programmatically determine the disparate pieces of information necessary for assembling service endpoint URLs. Software can be written to correctly construct each of these service endpoint URLs from the correct properties, eliminating the potential for human error in identifying the appropriate properties and attempting to construct the corresponding URLs.

There are two especially interesting resources available from the Ambari API:

1. /api/v1/clusters/CLUSTER_NAME/services?fields=components/host_components/HostRoles
This resource describes the service component host mappings for a cluster, which is useful for determining the host address for some service components.
(Note: CLUSTER_NAME is a placeholder for an actual cluster name; here and throughout)
Listing 2: Excerpt of service host components API response
"items" : [
  {
    "href" : "http://AMBARI_ADDRESS/api/v1/clusters/CLUSTER_NAME/services/HIVE",
    "components" : [
      {
        "ServiceComponentInfo" : { },
        "host_components" : [
          {
            "HostRoles" : {
              "cluster_name" : "CLUSTER_NAME",
              "component_name" : "HCAT",
              "host_name" : "c6402.ambari.apache.org"
            }
          }
        ]
      },
      {
        "ServiceComponentInfo" : { },
        "host_components" : [
          {
            "HostRoles" : {
              "cluster_name" : "CLUSTER_NAME",
              "component_name" : "HIVE_SERVER",
              "host_name" : "c6402.ambari.apache.org"
            }
          }
        ]
      }
    ]
  },
  {
    "href" : "http://AMBARI_ADDRESS/api/v1/clusters/CLUSTER_NAME/services/HDFS",
    "ServiceInfo" : {},
    "components" : [
      {
        "ServiceComponentInfo" : { },
        "host_components" : [
          {
            "HostRoles" : {
              "cluster_name" : "CLUSTER_NAME",
              "component_name" : "NAMENODE",
              "host_name" : "c6401.ambari.apache.org"
            }
          }
        ]
      }
    ]
  }
]
In Listing 2, we see that the HIVE_SERVER host is c6402.ambari.apache.org, and that the NAMENODE host is c6401.ambari.apache.org. An actual response would be more complete in terms of the number of mappings.

2. /api/v1/clusters/CLUSTER_NAME/configurations/service_config_versions?is_current=true
This resource lists a cluster's active service configuration versions and their respective contents.
Listing 3: Excerpt of service_config_versions API response
"items" : [
{
"href" : "http://AMBARI_ADDRESS/api/v1/clusters/CLUSTER_NAME/configurations/service_config_versions?service_name=HDFS&service_config_version=2",
"cluster_name" : "CLUSTER_NAME",
"configurations" : [
{
"Config" : {
"cluster_name" : "CLUSTER_NAME",
"stack_id" : "HDP-2.6"
},
"type" : "ssl-server",
"properties" : {
"ssl.server.keystore.location" : "/etc/security/serverKeys/keystore.jks",
"ssl.server.keystore.password" : "SECRET:ssl-server:1:ssl.server.keystore.password",
"ssl.server.keystore.type" : "jks",
"ssl.server.truststore.location" : "/etc/security/serverKeys/all.jks",
"ssl.server.truststore.password" : "SECRET:ssl-server:1:ssl.server.truststore.password"
},
"properties_attributes" : { }
},
{
"Config" : {
"cluster_name" : "CLUSTER_NAME",
"stack_id" : "HDP-2.6"
},
"type" : "hdfs-site",
"tag" : "version1",
"version" : 1,
"properties" : {
"dfs.cluster.administrators" : " hdfs",
"dfs.encrypt.data.transfer.cipher.suites" : "AES/CTR/NoPadding",
"dfs.hosts.exclude" : "/etc/hadoop/conf/dfs.exclude",
"dfs.http.policy" : "HTTP_ONLY",
"dfs.https.port" : "50470",
"dfs.journalnode.http-address" : "0.0.0.0:8480",
"dfs.journalnode.https-address" : "0.0.0.0:8481",
"dfs.namenode.http-address" : "c6401.ambari.apache.org:50070",
"dfs.namenode.https-address" : "c6401.ambari.apache.org:50470",
"dfs.namenode.rpc-address" : "c6401.ambari.apache.org:8020",
"dfs.namenode.secondary.http-address" : "c6402.ambari.apache.org:50090",
"dfs.webhdfs.enabled" : "true"
},
"properties_attributes" : {
"final" : {
"dfs.webhdfs.enabled" : "true",
"dfs.namenode.http-address" : "true",
"dfs.support.append" : "true",
"dfs.namenode.name.dir" : "true",
"dfs.datanode.failed.volumes.tolerated" : "true",
"dfs.datanode.data.dir" : "true"
}
}
}
]
},
{
"href" : "http://AMBARI_ADDRESS/api/v1/clusters/CLUSTER_NAME/configurations/service_config_versions?service_name=YARN&service_config_version=1",
"cluster_name" : "CLUSTER_NAME",
"configurations" : [
{
"Config" : {
"cluster_name" : "CLUSTER_NAME",
"stack_id" : "HDP-2.6"
},
"type" : "yarn-site",
"properties" : {
"yarn.http.policy" : "HTTP_ONLY",
"yarn.log.server.url" : "http://c6402.ambari.apache.org:19888/jobhistory/logs",
"yarn.log.server.web-service.url" : "http://c6402.ambari.apache.org:8188/ws/v1/applicationhistory",
"yarn.nodemanager.address" : "0.0.0.0:45454",
"yarn.resourcemanager.address" : "c6402.ambari.apache.org:8050",
"yarn.resourcemanager.admin.address" : "c6402.ambari.apache.org:8141",
"yarn.resourcemanager.ha.enabled" : "false",
"yarn.resourcemanager.hostname" : "c6402.ambari.apache.org",
"yarn.resourcemanager.webapp.address" : "c6402.ambari.apache.org:8088",
"yarn.resourcemanager.webapp.delegation-token-auth-filter.enabled" : "false",
"yarn.resourcemanager.webapp.https.address" : "c6402.ambari.apache.org:8090"
},
"properties_attributes" : { }
}
]
}
]
In Listing 3, we see the ssl-server, hdfs-site, and yarn-site configurations, which are the active versions for the cluster at the time the resource was accessed. Again, an actual response would be more complete, including all of the active configurations for the cluster.

Topology Generation
Furthermore, a tool that subsequently generates Knox topology files based on the cluster configuration information would eliminate the errors associated with manually copying endpoint URLs into the XML.
I've created a rudimentary implementation of such a tool to demonstrate its usefulness. It's a small set of Python scripts, which can be found at https://github.com/pzampino/knox-topology-gen/
This tool accepts simple YAML descriptors as input toward generating proper Knox topology files.
Listing 4: demo.yml
---
# Discovery info source
discovery-registry : http://c6401.ambari.apache.org:8080
# Provider config reference, the contents of which will be
# included in the resulting topology descriptor.
# The contents of this reference has a <gateway/> root, and
# contains <provider/> configurations.
provider-config-ref : ambari-cluster-policy.xml
# The cluster for which the service details should be discovered
cluster: mycluster
# The services to declare in the resulting topology descriptor,
# whose URLs will be discovered (unless a value is specified)
services:
- name: NAMENODE
- name: JOBTRACKER
- name: WEBHDFS
- name: WEBHCAT
- name: OOZIE
- name: WEBHBASE
- name: HIVE
- name: RESOURCEMANAGER
- name: AMBARI
url: http://c6401.ambari.apache.org:8080
- name: AMBARIUI
url: http://c6401.ambari.apache.org:8080
The discovery-registry is simply the Ambari API address for the instance managing the cluster for which the endpoint URLs are desired.
The provider-config-ref entry points to ambari-cluster-policy.xml; this referenced file provides the <gateway/> element content that is found in a topology file, and the tool copies this referenced content directly into its XML output. A nice side-effect of using provider configuration references is that these configurations can be more easily shared among Knox topologies.
It is necessary to specify a cluster because a Knox topology is a mapping for a specific cluster.
You'll also notice that some of the services entries have an associated url, while others do not. This is simply a way to tell the tool that it doesn't need to discover these URLs. The endpoint URLs for those services for which there is no url specified will be populated from the details discovered from Ambari.

Running the tool

A very simple script is provided for testing the TopologyBuilder Python object, and its output looks like the following:
knox-topology-gen pzampino$ python test_build_topology.py
NAMENODE : hdfs://c6401.ambari.apache.org:8020
JOBTRACKER : rpc://c6402.ambari.apache.org:8050
WEBHDFS : http://c6401.ambari.apache.org:50070/webhdfs
WEBHCAT : http://c6402.ambari.apache.org:50111/templeton
OOZIE : http://c6402.ambari.apache.org:11000/oozie
WEBHBASE : http://c6401.ambari.apache.org:60080
HIVE : http://c6402.ambari.apache.org:10001/cliservice
RESOURCEMANAGER : http://c6402.ambari.apache.org:8088/ws
AMBARI : http://c6401.ambari.apache.org:8080
AMBARIUI : http://c6401.ambari.apache.org:8080
Generated demo.xml
Note that demo.xml is a complete Knox topology file, which can be copied to the $GATEWAY_HOME/conf/topologies directory for deployment.
Check out the source for this tool for more details. Hopefully, this information is helpful to those who need to define Knox topologies.

Update: Support for this functionality has been added directly to Apache Knox, as of the 0.14.0 release. Check out the associated Apache Knox Dynamic Service Endpoint Discovery article for details.