Member since: 08-13-2019
Posts: 37
Kudos Received: 26
Solutions: 6
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5593 | 12-31-2018 08:44 AM
 | 1692 | 12-18-2018 08:39 PM
 | 1352 | 08-27-2018 11:29 AM
 | 3500 | 10-12-2017 08:35 PM
 | 2360 | 08-06-2017 02:57 PM
10-12-2017
08:35 PM
@Srikanth Gorripati
On the infrastructure side, you need to have the following in place:
A running HBase cluster
At least one HBase REST server up and running. If you have more than one, you can configure Knox in HA mode.
Knox configured to point to the HBase REST server URL(s) and port(s).
You can get some help troubleshooting Knox here.
On the application side, you need to:
Use a library that can perform HTTP requests. (There are many, and you probably have a favourite one.)
Direct your call to the Knox host:port endpoint, using the path of the defined topology and the HBase service: /gateway/default/hbase
Use the correct path of the HBase REST API as defined in the HBase book (see the example below).
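For instance, a minimal smoke test over Knox could look like the call below; hostname, port, topology name, and credentials are placeholders for your environment, and /version/cluster is one of the standard HBase REST endpoints:
curl -ik -u myUsername:myPassword -H "Accept: application/json" https://knox-host.example.com:8443/gateway/default/hbase/version/cluster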
08-06-2017
02:57 PM
1 Kudo
Hi @Rishabh Oberoi
The Kerberos principal and the OS user don't have much in common. Each OS user can authenticate as multiple Kerberos principals. The tickets for a Kerberos principal are stored in a file called the "ticket cache". You can see which principal you are authenticated as at the moment using the "klist" command. Just type "klist". In this example I am authenticated as "jimmy.page" in the Kerberos REALM "FIELD.HORTONWORKS.COM":
$ klist
Ticket cache: FILE:/tmp/krb5cc_1960402946
Default principal: jimmy.page@FIELD.HORTONWORKS.COM
Valid starting Expires Service principal
08/06/2017 14:47:12 08/07/2017 00:47:12 krbtgt/FIELD.HORTONWORKS.COM@FIELD.HORTONWORKS.COM
renew until 08/13/2017 14:47:12
Without kinit you shouldn't have a ticket in the ticket cache and should therefore see something like:
$ klist
klist: No credentials cache found (filename: /tmp/krb5cc_1960402946)
Before you run any "hadoop" or "hdfs" commands, you should check with klist whether you are authenticated and whether you are authenticated as the user you want to be. Thus, independently of which OS user you are, you can authenticate as hduser by simply doing:
kinit hduser
You will be prompted for the password of hduser. Now you should be able to use HDFS as hduser.
Note 1: Be prepared that you will not have any permissions to create directories or write data unless you grant these permissions using the HDFS-internal POSIX permission system or a corresponding policy in Apache Ranger.
Note 2: If you use keytabs instead of passwords (and for the sake of clarity), it makes sense to create an OS user AND a Kerberos principal with the same name, and to give only that OS user permissions on the keytab.
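A minimal check-and-authenticate sequence could look like the sketch below (the keytab path and principal name are just illustrative assumptions, not taken from the original question):
$ klist                                                    # check whether and as whom you are authenticated
$ kinit hduser                                             # password-based login as hduser
$ kinit -kt /etc/security/keytabs/hduser.keytab hduser     # keytab-based login, no password prompt
$ hdfs dfs -ls /user/hduser                                # verify HDFS access as hduser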
07-13-2017
10:27 PM
4 Kudos
This article is based on one of my blog posts. It is specifically about how to troubleshoot and debug an application behind Knox and ultimately get it up and running.
Start Small
First try to access the service directly before you go over Knox. In many cases, there's nothing wrong with your Knox setup, but with either the way you set up and configured the service behind Knox or the way you try to access that service.
Once you are familiar with how to access your service directly and have verified that it works as intended, try the same call over Knox. Example:
You want to check if WebHDFS is reachable, so you first verify it directly at the service and try to get the home directory:
curl --negotiate -u : http://webhdfs-host.field.hortonworks.com:50070/webhdfs/v1/?op=GETHOMEDIRECTORY
If the above request gives a valid 200 response and a meaningful answer, you can move on and check your Knox setup:
curl -k -u myUsername:myPassword https://knox-host.field.hortonworks.com:8443/gateway/default/webhdfs/v1/?op=GETHOMEDIRECTORY
Note: Direct access of WebHDFS and access of WebHDFS over Knox use two different authentication mechanisms: the first one uses SPNEGO, which in a secure cluster requires a valid Kerberos TGT if you don't want to receive a "401 – Unauthorized" response. The latter uses HTTP basic authentication against an LDAP server, which is why you need to provide a username and password on the command line.
Note 2: For the sake of completeness: obviously, you direct the first request to the service host and port, while you direct the second request to the Knox host and port and specify the service in the path.
The next section answers the question of what to do if the second command fails. (If the first command fails, go set up your service correctly and come back later.)
Security Related Issues
So what do the HTTP response codes mean for a Knox application? Where to start?
Very common is "401 – Unauthorized". This can be misleading, since 401 is always tied to authentication – not authorization. It means you probably need to check one of the following items. Which of these items causes the error can be found in the Knox log (by default /var/log/knox/gateway.log):
Is your username/password combination correct (LDAP)?
Is your username/password combination present in the LDAP you used?
Is your LDAP server running?
Is your LDAP configuration in the Knox topology correct (hostname, port, binduser, binduser password,…)?
Is your LDAP controller accessible through the firewall (ports 389 or 636 open from the Knox host)?
Note: Currently (in HDP 2.6), you can specify an alias for the bind user password. Make sure that this alias is all lowercase; otherwise you will get a 401 response as well.
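While testing, keep an eye on the Knox log, and reproduce the LDAP bind and user lookup outside of Knox to rule out the LDAP-related items above. This is only a sketch; the hostname, bind DN, search base, and filter are placeholders for your environment:
tail -f /var/log/knox/gateway.log
ldapsearch -H ldap://ldap-host.example.com:389 \
    -D "uid=binduser,ou=people,dc=example,dc=com" -w 'binduserPassword' \
    -b "dc=example,dc=com" "(uid=myUsername)"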
If you got past the 401s, a popular response code is "403 – Forbidden". This actually does have something to do with authorization. Depending on whether you use ACL authorization or Ranger authorization (which is recommended), you proceed differently. If you use ACLs, make sure that the user/group is authorized in your topology definition. If you use Ranger, check the Ranger audit log dashboard and you will immediately notice two possible error sources:
Your user/group is not allowed to use Knox.
Your user/group is not allowed to use the service that you want to access behind Knox.
Well, we have come a long way, and with respect to security we are almost done. One possible problem you could still run into is impersonation. Knox needs to be allowed to impersonate any user who accesses a service via Knox. This is a configuration in core-site.xml: hadoop.proxyuser.knox.groups and hadoop.proxyuser.knox.hosts. Enter a comma-separated list of groups and hosts that should be able to access a service over Knox, or set a wildcard *.
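As a sketch, the two properties in core-site.xml could look like this (the group list and hostname are placeholder values; use * as a wildcard as described above):
<property>
  <name>hadoop.proxyuser.knox.groups</name>
  <value>users,admins</value>
</property>
<property>
  <name>hadoop.proxyuser.knox.hosts</name>
  <value>knox-host.example.com</value>
</property>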
This is what you get in the Knox log, when your Ranger Admin server is not running and policies cannot be refreshed.
2017-07-05 21:11:53,700 ERROR util.PolicyRefresher (PolicyRefresher.java:loadPolicyfromPolicyAdmin(288)) - PolicyRefresher(serviceName=condlahdp_knox): failed to refresh policies. Will continue to use last known version of policies (3)
javax.ws.rs.ProcessingException: java.net.ConnectException: Connection refused (Connection refused)
This is also a nice example of Ranger's design of not interfering with services when it is down: policies will not be refreshed, but the plugin can still operate as intended with the set of policies it had before Ranger crashed.
Application Specific Issues
Once you are past the authentication and authorization issues, there might be issues with how Knox interacts with its applications. This section might grow with time. If you have more examples of application-specific issues, leave a comment or send me an email.
Hive:
To make Hive work with Knox, you need to change the HiveServer2 transport mode from binary to http. In rare cases it might be necessary to restart not only HiveServer2 after this configuration change, but also the Knox gateway.
This is what you get when you don't switch the transport mode from "binary" to "http": binary mode runs on port 10000, http mode runs on port 10001. While binary transport mode is still active, Knox will try to connect to port 10001, which is not available, and thus fails with "Connection refused".
2017-07-05 08:24:31,508 WARN hadoop.gateway (DefaultDispatch.java:executeOutboundRequest(146)) - Connection exception dispatching request: http://condla0.field.hortonworks.com:10001/cliservice?doAs=user org.apache.http.conn.HttpHostConnectException: Connect to condla0.field.hortonworks.com:10001 [condla0.field.hortonworks.com/172.26.201.30] failed: Connection refused (Connection refused)
org.apache.http.conn.HttpHostConnectException: Connect to condla0.field.hortonworks.com:10001 [condla0.field.hortonworks.com/172.26.201.30] failed: Connection refused (Connection refused)
at org.apache.http.impl.conn.DefaultHttpClientConnectionOperator.connect(DefaultHttpClientConnectionOperator.java:151)
at org.apache.http.impl.conn.PoolingHttpClientConnectionManager.connect(PoolingHttpClientConnectionManager.java:353)
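The relevant HiveServer2 settings are the ones below (these are standard Hive property names; the values shown are the usual defaults for http mode, so verify them against your cluster):
hive.server2.transport.mode=http
hive.server2.thrift.http.port=10001
hive.server2.thrift.http.path=cliservice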
When you have fixed all possible HTTP 401 errors for other services, but still get one for Hive, you might have forgotten to pass username and password to beeline:
beeline -u "<jdbc-connection-string>" -n <username> -p <password>
The correct JDBC connection string should have a format as in the example below:
jdbc:hive2://$KNOX_HOSTNAME:$KNOX_PORT/default;ssl=true;sslTrustStore=$TRUSTSTORE_PATH;trustStorePassword=$TRUSTSTORE_SECRET;transportMode=http;httpPath=gateway/default/hive
$KNOX_HOSTNAME is the hostname where the Knox instance is running.
$KNOX_PORT is the port exposed by Knox.
$TRUSTSTORE_PATH is the path to the truststore containing the Knox server certificate; on the Knox server, with root access, you could e.g. use /usr/hdp/current/knox-server/data/security/keystores/gateway.jks.
$TRUSTSTORE_SECRET is the secret you are using for your truststore.
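Putting it together, a call could look like this sketch, where hostname, port, truststore path, and credentials are placeholder values:
beeline -u "jdbc:hive2://knox-host.example.com:8443/default;ssl=true;sslTrustStore=/tmp/gateway.jks;trustStorePassword=myTruststorePassword;transportMode=http;httpPath=gateway/default/hive" -n myUsername -p myPassword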
Now, this is what you get when you connect via beeline and try to talk to Knox from a different (e.g. internal) hostname than the one configured in the SSL certificate of the server. Just change the hostname and everything will work fine. While this error is not specifically Hive-related, you will most of the time encounter it in combination with Hive, since most of the other services don't require you to check your certificates.
Connecting to jdbc:hive2://knoxserver-internal.field.hortonworks.com:8443/;ssl=true;sslTrustStore=truststore.jks;trustStorePassword=myPassword;transportMode=http;httpPath=gateway/default/hive
17/07/06 12:13:37 [main]: ERROR jdbc.HiveConnection: Error opening session
org.apache.thrift.transport.TTransportException: javax.net.ssl.SSLPeerUnverifiedException: Host name 'knoxserver-internal.field.hortonworks.com' does not match the certificate subject provided by the peer (CN=knoxserver.field.hortonworks.com, OU=Test, O=Hadoop, L=Test, ST=Test, C=US)
HBase:
WEBHBASE is the service in a Knox topology used to access HBase via the HBase REST server. Of course, a prerequisite is that the HBase REST server is up and running.
Even if it is up and running, it can happen that you receive an error with HTTP code 503 (Service Unavailable). This is not related to Knox. You can usually track it down to an HBase-side issue, in which the authenticated user does not have the privileges to, e.g., scan the data. Give the user the correct permissions to solve this error.
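If you manage HBase permissions with HBase's native ACLs rather than Ranger, granting read access could look like the following sketch (user and table names are placeholders):
# run as a user with HBase admin rights
echo "grant 'myUsername', 'R', 'my_table'" | hbase shell    # R = read; use 'RWXCA' for full rights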
06-22-2017
11:28 AM
Thanks @Jay SenSharma
06-22-2017
10:30 AM
@Jay SenSharma thanks. Yeah... I forgot that I upgraded yesterday. I have ambari-agent and ambari-server 2.5.1; ambari-infra-solr, ambari-metrics-collector, etc. are still 2.5.0.3. As a workaround I uncommented the piece of code in the master.py file. Is there another possible solution?
06-22-2017
08:30 AM
I am trying to start the Zeppelin server from Ambari. This worked fine until, at some point, Ambari started failing to start it with:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/common-services/ZEPPELIN/0.6.0.2.5/package/scripts/master.py", line 450, in <module>
Master().execute()
File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 329, in execute
method(env)
File "/var/lib/ambari-agent/cache/common-services/ZEPPELIN/0.6.0.2.5/package/scripts/master.py", line 227, in start
self.update_kerberos_properties()
File "/var/lib/ambari-agent/cache/common-services/ZEPPELIN/0.6.0.2.5/package/scripts/master.py", line 302, in update_kerberos_properties
and params.zookeeper_znode_parent not in interpreter['properties']['phoenix.url']:
KeyError: 'phoenix.url'
I didn't change any configs, and restarting the Ambari server/agents does not help. Ambari 2.5.0, HDP 2.6.0.3.
Labels:
- Apache Ambari
- Apache Zeppelin
05-30-2017
08:25 PM
To answer your question regarding Zookeeper: HBase needs Zookeeper. If you didn't set up Zookeeper yourself, HBase spins up an "internal" Zookeeper server, which is great for testing, but shouldn't be used in production scenarios.
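A minimal sketch of pointing HBase at an external Zookeeper ensemble (the quorum hosts are placeholders; in an HDP/Ambari cluster this is handled for you through the UI):
# conf/hbase-env.sh – tell HBase not to manage its own Zookeeper
export HBASE_MANAGES_ZK=false
# conf/hbase-site.xml – point HBase at the external ensemble:
#   hbase.zookeeper.quorum = zk1.example.com,zk2.example.com,zk3.example.com
#   hbase.zookeeper.property.clientPort = 2181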
05-30-2017
07:52 PM
@Sebastien Chausson Cool, glad to see that you got it up and running yourself! If my answer was helpful you can vote it up or mark it as best answer. 🙂
05-29-2017
05:56 AM
2 Kudos
Hi,
Regarding your first bunch of questions: the answer depends on which distribution and version you use, or whether you are using vanilla HBase. If you, e.g., install HDP 2.4, here is a guide on how to start the Thrift server: https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.4.3/bk_installing_manually_book/content/ref-2a6efe32-d0e1-4e84-9068-4361b8c36dc8.1.html
Regarding your last question: the error message indicates that you don't have the thrift module installed, which you need on the client side to execute your Python program.
Depending on how you manage packages, e.g. using pip, you would need to install the thrift module:
pip install thrift
Doing so, at least this error message will disappear.
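For reference, outside of HDP you can also start the Thrift server straight from the HBase installation; this is just a sketch of the standard commands:
hbase thrift start                    # runs in the foreground, listens on port 9090 by default
bin/hbase-daemon.sh start thrift      # or start it as a background daemon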
05-17-2017
06:52 AM
Thanks Mayank. Just wanted to add one thing to clarify for others who might have this problem, because I wasted some time on this myself: to solve the "SSLContext must not be null" error, you correctly stated "distribute keystore and truststore file to all machines". I happened to only distribute them to the HBase Master nodes, but it's important to also deploy the same keystores to all region server machines.
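A sketch of what that distribution could look like; the paths and hostnames are placeholders, and the region servers need a restart afterwards to pick up the stores:
for host in regionserver1.example.com regionserver2.example.com; do
  scp /etc/hbase/conf/keystore.jks /etc/hbase/conf/truststore.jks $host:/etc/hbase/conf/
done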