09-11-2017 03:06 PM
Hi,
I'm trying to run a Spark job in which all executors must call a secured (HTTPS) web service on a dedicated server. During the SSL handshake, this server returns a certificate signed by a private (company-specific) CA. The CA certificate has been added to a custom truststore (cacert) that I would like to reference in the Spark configuration, so that executors can validate the server's certificate without any extra setup.

I know that I can pass the following option to my spark-submit command line:

--conf "spark.executor.extraJavaOptions=-Djavax.net.ssl.trustStore=<MyCaCert> -Djavax.net.ssl.trustStorePassword=<MyPassword>"

...but I would like to avoid asking all our users to do this (they are not supposed to know where this truststore is located, nor its password). I tried to use the "ssl.client.truststore.location" property as described in https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.5.3/bk_security/content/ch_wire-webhdfs-mr-yarn.html but it didn't change anything, so apparently Spark does not use this configuration. Do you guys know how the default truststore used by Spark executors is configured? Any help will be highly appreciated 🙂 Thanks
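For reference, one way to avoid putting the burden on users would be to set the same JVM options once, cluster-wide, in spark-defaults.conf. A sketch with placeholder path and password (not values from this thread); note that the password then ends up readable in the config file, which may or may not be acceptable:

```
# spark-defaults.conf (e.g. via Ambari's custom spark-defaults section)
# Path and password below are illustrative placeholders.
spark.executor.extraJavaOptions  -Djavax.net.ssl.trustStore=/etc/security/clientKeys/custom-cacerts -Djavax.net.ssl.trustStorePassword=changeit
spark.driver.extraJavaOptions    -Djavax.net.ssl.trustStore=/etc/security/clientKeys/custom-cacerts -Djavax.net.ssl.trustStorePassword=changeit
```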
Labels:
- Apache Spark
08-09-2017 07:35 AM
End of the story: in fact, the problem was related to https://issues.apache.org/jira/browse/HADOOP-10786. I moved to hadoop-common 2.6.1 and used the AuthUtil class (http://hbase.apache.org/1.2/devapidocs/org/apache/hadoop/hbase/AuthUtil.html), and everything started to work fine 🙂 Thanks for your help
08-07-2017 06:19 AM
Hi Josh, thanks for your help. Unfortunately, I'm still stuck with this issue, which seems related to HBase only, not a pure Kerberos/Hadoop problem, if I understand properly. I gave a try to a "non-HBase" web service that simply lists the content of an HDFS folder, with the exact same idea (log in to the cluster at application startup, plus a background thread that periodically renews the ticket), and it works like a charm: I invoke the WS, it properly lists the files in the HDFS folder, and I can then wait several days without any other activity on the web application and call it again successfully. Perfect.

Then, back to my HBase example: my web service logs in at startup, creates an HBase connection and displays the name of one table. But if I wait longer than the ticket lifetime, when I invoke the web service again I get the previously mentioned warnings. According to your answer, I guess I can ignore the first ones, but the last one is probably the reason why my web service ends with a socket timeout error:

17/08/01 16:02:01 WARN ipc.AbstractRpcClient: Couldn't setup connection for myuser@mydomain.com to hbase/myserver.mydomain.com@mydomain.com ...

As you were wondering what would occur next: I waited for a couple of minutes (>10) and got the same warning sequence again and again during this period, leading to a socket timeout error on the client side (which is not acceptable...). Finally, I took a look at your last suggestion, but when I try to proceed with 'kinit -R', I get:

kinit: KDC can't fulfill requested option while renewing credentials

...and my ticket expiration time is not updated by this command. Could that be the root cause of my problem? Thanks again
08-03-2017 07:11 AM
Hi,
I'm trying to set up web services that interact with my kerberized Hadoop/HBase cluster.
My application is deployed in a Tomcat server, and I would like to avoid recreating a new HBase connection each and every time I have to access HBase.
Similarly, I want my application to be self-sufficient, i.e. I don't want to run 'kinit' commands before starting up my Tomcat server.
Thus, I would like to implement a utility class in charge of managing the login operation on the cluster and the connection to HBase, but I'm struggling with what looks like ticket-expiration issues.

The first time my GetHbaseConnection() method is invoked, it properly connects to the cluster using the provided keytab and principal (via the UserGroupInformation.loginUserFromKeytab(user, keyTabPath) method) and creates a brand new HBase connection (ConnectionFactory.createConnection(conf)) => perfect. By default, the obtained ticket has a 10h lifetime (default value from the /etc/krb5.conf file), so everything seems to work fine during the first 10 hours. Unfortunately, after the ticket has expired, my code fails with the following exception:

17/08/01 07:40:52 http-nio-8443-exec-4 WARN AbstractRpcClient:699 - Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
17/08/01 07:40:52 http-nio-8443-exec-4 ERROR AbstractRpcClient:709 - SASL authentication failed. The most likely cause is missing or invalid credentials. Consider 'kinit'. javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
=> I had to set up a dedicated thread that invokes the UserGroupInformation.checkTGTAndReloginFromKeytab() method on a regular basis in order to refresh the ticket.
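A minimal sketch of such a refresh thread, using a daemon ScheduledExecutorService. The relogin call is injected as a Runnable so the scheduling part stands alone; in the real application the action would call UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab(), wrapped in a try/catch since it throws IOException:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class TicketRenewer {
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor(runnable -> {
                Thread t = new Thread(runnable, "kerberos-ticket-renewer");
                t.setDaemon(true); // don't prevent Tomcat from shutting down
                return t;
            });

    /**
     * Runs the given relogin action immediately, then at a fixed rate.
     * In the real application the action would call
     * UserGroupInformation.getLoginUser().checkTGTAndReloginFromKeytab()
     * inside a try/catch for IOException.
     */
    public void start(Runnable reloginAction, long periodSeconds) {
        scheduler.scheduleAtFixedRate(reloginAction, 0, periodSeconds, TimeUnit.SECONDS);
    }

    public void stop() {
        scheduler.shutdownNow();
    }
}
```

Note that UserGroupInformation itself rate-limits relogin attempts (hence the "less than 600 seconds" warnings below), so a refresh period well below the ticket lifetime but above that floor is a reasonable choice.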
Anyway, after a long time of inactivity (typically a whole night), when I try to invoke my web service, I can see the following warnings in my Tomcat logs:

17/08/03 08:25:28 hconnection-0x51b0ea6-shared--pool1-t51 WARN UserGroupInformation:1113 - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
17/08/03 08:25:29 hconnection-0x51b0ea6-shared--pool1-t51 WARN UserGroupInformation:1113 - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
17/08/03 08:25:30 hconnection-0x51b0ea6-shared--pool1-t51 WARN UserGroupInformation:1113 - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
17/08/03 08:25:31 hconnection-0x51b0ea6-shared--pool1-t51 WARN UserGroupInformation:1113 - Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
17/08/03 08:25:35 hconnection-0x51b0ea6-shared--pool1-t51 WARN AbstractRpcClient:695 - Couldn't setup connection for myuser@mydomain.com to hbase/myserver.mydomain.com@mydomain.com
...and my call to the web service finally fails with a SocketTimeoutException. To reproduce the issue quickly, I wrote a simple standalone Java application (outside of Tomcat) and removed the code that logs the user in to the cluster, delegating that part to an external/manual kinit operation:

1. Perform a 'kinit' operation outside of my Java application. This way I am able to get a "short-life" (1 minute) ticket using a custom krb5.conf file:

env KRB5_CONFIG=/local/home/myuser/mykrb5.conf kinit -kt /local/home/myuser/myuser.keytab myuser@mydomain.com
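For reference, such a short ticket lifetime can be obtained with a krb5.conf excerpt along these lines (a sketch; the realm and values are illustrative placeholders, not the actual file from this thread):

```
[libdefaults]
    default_realm = MYDOMAIN.COM
    # very short lifetime (60 seconds) to reproduce ticket expiry quickly
    ticket_lifetime = 60
```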
2. Execute my standalone Java application, which displays the name of one table in HBase on a regular basis (every 10 seconds). Note that I create a new HBase connection on every iteration; I don't try to reuse the connection at the moment:

public static void main(String[] args) throws IOException, InterruptedException {
    System.setProperty("sun.security.krb5.debug", "true");
    Configuration configuration = HBaseConfiguration.create();
    while (true) {
        // New connection on every iteration on purpose (no reuse yet)
        try (Connection conn = ConnectionFactory.createConnection(configuration);
             Admin admin = conn.getAdmin()) {
            TableName[] tableNames = admin.listTableNames();
            System.out.println(tableNames[0].getNameWithNamespaceInclAsString());
        }
        Thread.sleep(10000); // sleep() is static; calling it via currentThread() was misleading
    }
}
For about one minute it works perfectly, but then I face endless warnings and my code no longer executes properly: 17/08/01 16:01:55 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
17/08/01 16:01:57 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
17/08/01 16:01:59 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
17/08/01 16:02:00 WARN security.UserGroupInformation: Not attempting to re-login since the last re-login was attempted less than 600 seconds before.
17/08/01 16:02:01 WARN ipc.AbstractRpcClient: Couldn't setup connection for myuser@mydomain.com to hbase/myserver.mydomain.com@mydomain.com ...
I don't understand how Kerberos ticket expiration and the HBase connection work together. Could anyone help on this topic? In other words, I would like my application to connect to the cluster when it starts up and create an HBase connection that I can keep "forever". Is that possible? What did I miss? Thanks for your help
Labels:
- Apache HBase
07-06-2017 06:25 AM
Hi Josh, you are right: I gave it a try using another principal for my client, to match the realm of the principal used by PQS, and it works fine now. Thanks a lot for your help
07-05-2017 10:02 AM
I'm trying to use the Phoenix Query Server on my kerberized cluster. I tried to connect to it with the provided thin client tool, without any success:

/sqlline-thin.py http://myserver.fqdn:8765

This results in the following error: ...
17/07/05 11:49:51 DEBUG auth.HttpAuthenticator: Authentication succeeded
17/07/05 11:49:51 DEBUG conn.DefaultManagedHttpClientConnection: http-outgoing-0: Close connection
17/07/05 11:49:51 DEBUG execchain.MainClientExec: Connection discarded
17/07/05 11:49:51 DEBUG conn.PoolingHttpClientConnectionManager: Connection released: [id: 0][route: {}->http://fr0-datalab-p31.bdata.corp:8765][total kept alive: 0; route allocated: 0 of 25; total allocated: 0 of 100]
java.lang.RuntimeException: Failed to execute HTTP Request, got HTTP/403
at org.apache.calcite.avatica.remote.AvaticaCommonsHttpClientSpnegoImpl.send(AvaticaCommonsHttpClientSpnegoImpl.java:148)
at org.apache.calcite.avatica.remote.RemoteProtobufService._apply(RemoteProtobufService.java:44)
at org.apache.calcite.avatica.remote.ProtobufService.apply(ProtobufService.java:81)
at org.apache.calcite.avatica.remote.Driver.connect(Driver.java:175)
at sqlline.DatabaseConnection.connect(DatabaseConnection.java:157)
at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:203)
However, it ends up at the CLI prompt: 0: jdbc:phoenix:thin:url=http://myserver> ...but without any valid connection: as soon as I try to run a "select" statement, I get a "No current connection" message. To solve this, I tried to execute

!connect myserver.fqdn myuser mypassword

but faced: No known driver to handle "myserver.fqdn"

It's worth saying that I performed a successful "kinit" beforehand, and that PQS runs fine on myserver.fqdn and listens on port 8765 (the default one). Any idea how I could investigate this issue further? Thanks for your help
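For what it's worth, Avatica (which PQS is built on) also lets the thin client specify the Kerberos identity directly in the JDBC URL. Assuming a recent enough Avatica version, the URL would look along these lines (property names from the Avatica client reference; the principal and keytab path are placeholders):

```
jdbc:avatica:remote:url=http://myserver.fqdn:8765;serialization=PROTOBUF;authentication=SPNEGO;principal=myuser@MYDOMAIN.COM;keytab=/local/home/myuser/myuser.keytab
```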
Labels:
- Apache HBase
- Apache Phoenix
06-21-2017 12:01 PM
@Aravindan Vijayan Thanks a lot for these helpful pointers. Actually, I was not aware of this time boundary, and as soon as I changed the timestamps in my json data, it started to work fine!
06-20-2017 09:48 AM
I finally figured out how to connect to the Ambari Metrics tables with Phoenix: by default, sqlline.py points to the "main" HBase configuration, not the AMS embedded instance. By defining the HBASE_CONF_DIR environment variable, I got it working:

export HBASE_CONF_DIR=/etc/ambari-metrics-collector/conf
/usr/hdp/current/phoenix-client/bin/sqlline.py fr0-datalab-p09.bdata.corp:61181:/ams-hbase-secure

I guess there is something similar when trying to connect to ZooKeeper, to point to the embedded instance instead of the "main" ZooKeeper of the cluster, but I couldn't solve this at the moment...
06-20-2017 07:47 AM
Hi, I've been struggling with Ambari Metrics for a couple of days and cannot figure out how to investigate further. Basically, I have a secured (kerberized) HDP 2.5 cluster and I would like to post custom metrics into Ambari Metrics. It's worth saying that the timeline.metrics.service.operation.mode property has the "embedded" value, which means (if I understood properly) that AMS has an embedded HBase instance with its own ZooKeeper. Let's name the server running the ambari-metrics collector: server1.mydomain.com

I gave it a try with the following request:

curl -H "Content-Type: application/json" -X POST -d '{"metrics": [{"metricname": "AMBARI_METRICS.SmokeTest.FakeMetric", "appid": "amssmoketestfake", "hostname": "server1.mydomain.com", "timestamp": 1432075898000, "starttime": 1432075898000, "metrics": {"1432075898000": 0.963781711428, "1432075899000": 1432075898000}}]}' "http://server1.mydomain.com:6188/ws/v1/timeline/metrics"

=> Returned an HTTP 200 code, with the following json data: {"errors":[]}

Then, I tried to retrieve this dummy metric with the following request:

curl -H "Content-Type: application/json" -X GET "http://server1.mydomain.com:6188/ws/v1/timeline/metrics?metricNames=AMBARI_METRICS.SmokeTest.FakeMetric&appId=amssmoketestfake&hostname=server1.mydomain.com"

=> Returned an HTTP 200 code, with the following json data: {"metrics":[]}

While trying to figure out why my metrics don't come up in this GET request, I'm facing security concerns. First, I want to connect to phoenix/hbase to check whether the metrics were properly stored. I checked the following properties in the /etc/ambari-metrics-collector/conf/hbase-site.xml file:

- hbase.zookeeper.property.clientPort: 61181
- zookeeper.znode.parent: /ams-hbase-secure

So I gave the following command a try:

/usr/hdp/current/phoenix-client/bin/sqlline.py server1.mydomain.com:61181:/ams-hbase-secure

I receive the following warning every 15 seconds, and the connection never succeeds:

17/06/20 08:46:04 WARN ipc.AbstractRpcClient: Couldn't setup connection for myuser@mydomain.com to hbase/server1.mydomain.com@mydomain.com

=> Should I use a particular user to execute this command (ams, hbase, ...)? Is it even possible to connect to the embedded Phoenix instance like this? I also tried to connect to the embedded HBase's ZooKeeper instance with the following command:

zookeeper-client -server server1.mydomain.com:61181

But I couldn't connect and received the following errors:

2017-06-20 09:30:24,306 - ERROR [main-SendThread(server1.mydomain.com:61181):ZooKeeperSaslClient@388] - An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7))]) occurred when evaluating Zookeeper Quorum Member's received SASL token. Zookeeper Client will go to AUTH_FAILED state.
2017-06-20 09:30:24,306 - ERROR [main-SendThread(server1.mydomain.com:61181):ClientCnxn$SendThread@1059] - SASL authentication with Zookeeper Quorum member failed: javax.security.sasl.SaslException: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7))]) occurred when evaluating Zookeeper Quorum Member's received SASL token. Zookeeper Client will go to AUTH_FAILED state.

What's wrong with this? I performed a kinit operation beforehand, but it seems that my ticket is not granted sufficient permissions. Should I try to connect with a specific user in order to read the ZooKeeper content? Thanks for your help
Labels:
- Apache Ambari
- Apache HBase
06-15-2017 02:34 PM
Thanks a lot, it fixed the problem. I don't know why this property was set like this "out of the box" in the sandbox...