05-01-2024 03:15 AM (1 Kudo)
@smruti: 1. Agreed, I don't strictly need to get the jsqsh connection working; it would just be nice to have in order to test things more directly. I am confused as to how the client can be having compatibility issues despite using the deployed Hive version's own jars, though. (The classpath I'm using, the default in JSQSH, is ${HIVE_HOME}/lib:${HADOOP_HOME}, which works out to /opt/cloudera/parcels/CDH/lib/hive/lib:/opt/cloudera/parcels/CDH/lib/hadoop.) However, we can set that aside.

2. I agree that this seems like the root of the problem, but I can't determine what certs/truststores I need. I've tried copying the truststore that works on the CDP cluster itself, but it doesn't work from the remote host. I've tried importing all the certs I can find, but none seem to work. I'm probably missing something very obvious, but I don't know what.

So, for example, I scp /var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks from the CDP cluster to /tmp/cm-auto-global_truststore.jks on the remote agent. I can use keytool -list to confirm that I have the right password and can see the certs it contains:

    Keystore type: jks
    Keystore provider: SUN

    Your keystore contains 1 entry

    cmrootca-0, 23-Apr-2024, trustedCertEntry,
    Certificate fingerprint (SHA-256): D8:03:7A:38:7A:D7:B5:AC:A4:FC:78:5D:AB:5A:1B:9F:2D:2A:0E:96:FD:11:10:E4:D4:54:73:97:9A:0A:E9:71

I copy the Hive keytab and kinit, and that's all good:

    Default principal: hive/cluster-head.domain.com@REALM.COM

    Valid starting     Expires            Service principal
    01/05/24 02:51:38  02/05/24 02:51:38  krbtgt/REALM.COM@REALM.COM
        renew until 06/05/24 02:51:38

So everything seems present and correct. However, the connection attempts look like:

    24/05/01 02:58:25 INFO jdbc.HiveConnection: Could not open client transport with JDBC Uri: jdbc:hive2://cluster-head.domain.com:10000/default;user=hive;password=hive;principal=hive/cluster-head.domain.com@REALM.COM;ssl=true;sslTrustStore=/tmp/cm-auto-global_truststore.jks;trustStorePassword=...
    24/05/01 02:58:25 INFO jdbc.HiveConnection: Transport Used for JDBC connection: null
    2024-05-01 02:55:06 | javax.ws.rs.ProcessingException: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

Are there more or different certs I need to import on the remote agent, beyond what's in the truststore that works (in Beeline) when connecting from the same machine?
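Incidentally, the keytool -list check above can also be reproduced in Java with the same java.security.KeyStore API the JDBC driver ultimately uses, which separates "file or password is wrong" failures from TLS handshake failures. A minimal sketch, with the path and password as placeholders for the real values:

```java
import java.io.FileInputStream;
import java.security.KeyStore;
import java.security.cert.X509Certificate;
import java.util.Collections;

public class TruststoreCheck {
    public static void main(String[] args) throws Exception {
        String path = "/tmp/cm-auto-global_truststore.jks"; // copied truststore
        char[] password = "changeit".toCharArray();         // placeholder password

        KeyStore ks = KeyStore.getInstance("JKS");
        try (FileInputStream in = new FileInputStream(path)) {
            ks.load(in, password); // throws if the file or the password is wrong
        }
        // Print each trusted entry's subject, much like keytool -list -v
        for (String alias : Collections.list(ks.aliases())) {
            X509Certificate cert = (X509Certificate) ks.getCertificate(alias);
            System.out.println(alias + ": " + cert.getSubjectX500Principal());
        }
    }
}
```

If this loads and lists the cmrootca-0 entry but the handshake still fails, the problem is the chain the server presents rather than the local file.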
04-30-2024 08:25 AM (1 Kudo)
Thanks @tj2007. I've seen that article, but I'm not clear on how to apply it. The application I'm using doesn't seem to use the Cloudera JDBC driver, and I don't think I can add it. I note that the format of the options specified there doesn't resemble the strings used by Beeline, which seems to be the format my application uses as well. Is there a solution I can use with the Hive/Beeline format? I've tried the Cloudera driver before and been unable to get it to work: previously due to driver/class errors, and now, with this format, for new reasons:

    [DEBUG] 2024-04-30 08:16:44.942 [main] TSaslTransport - opening transport org.apache.thrift.transport.TSaslClientTransport@2b9f74d0
    [DEBUG] 2024-04-30 08:16:44.943 [main] TSaslClientTransport - Sending mechanism name PLAIN and initial response of length 10
    [DEBUG] 2024-04-30 08:16:44.946 [main] TSaslTransport - CLIENT: Writing message with status START and payload length 5
    [DEBUG] 2024-04-30 08:16:44.951 [main] TSaslTransport - CLIENT: Writing message with status COMPLETE and payload length 10
    [DEBUG] 2024-04-30 08:16:44.952 [main] TSaslTransport - CLIENT: Start message handled
    [DEBUG] 2024-04-30 08:16:44.952 [main] TSaslTransport - CLIENT: Main negotiation loop complete
    [DEBUG] 2024-04-30 08:16:44.952 [main] TSaslTransport - CLIENT: SASL Client receiving last message
    [WARN ] 2024-04-30 08:16:44.954 [main] HiveConnection - Failed to connect to...
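For anyone comparing the two formats: the Apache Hive driver (org.apache.hive.jdbc.HiveDriver) takes the semicolon-separated session variables Beeline shows, while the Cloudera driver (com.cloudera.hive.jdbc.HS2Driver) uses its own key/value options. A minimal sketch contrasting them; the Cloudera option names (AuthMech, KrbRealm, SSL, SSLTrustStore, etc.) are recalled from its documentation and should be checked against the installed driver's docs:

```java
public class HiveUrls {
    // Apache Hive JDBC driver: the Beeline-style URL used elsewhere in this thread.
    static String apacheStyle(String host, String trustStore) {
        return "jdbc:hive2://" + host + ":10000/default"
                + ";principal=hive/_HOST@DOMAIN.COM"
                + ";ssl=true"
                + ";sslTrustStore=" + trustStore;
    }

    // Cloudera JDBC driver: different option names (AuthMech=1 selects
    // Kerberos, SSL=1 enables TLS). Treat these names as assumptions to
    // verify against the driver's own documentation.
    static String clouderaStyle(String host, String trustStore) {
        return "jdbc:hive2://" + host + ":10000"
                + ";AuthMech=1;KrbRealm=DOMAIN.COM"
                + ";KrbHostFQDN=" + host + ";KrbServiceName=hive"
                + ";SSL=1;SSLTrustStore=" + trustStore;
    }

    public static void main(String[] args) {
        String ts = "/tmp/cm-auto-global_truststore.jks";
        System.out.println(apacheStyle("cluster-head.domain.com", ts));
        System.out.println(clouderaStyle("cluster-head.domain.com", ts));
    }
}
```

The two option sets are not interchangeable: passing Beeline-style ssl=true/sslTrustStore to the Cloudera driver, or SSL=1/SSLTrustStore to the Apache driver, is silently ignored or rejected.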
04-30-2024 06:54 AM
I'm trying to get a connection to Hive after enabling SSL. I can already connect with Kerberos; it's the addition of SSL specifically that's giving me trouble. The application I need to connect uses JDBC in Java. I don't have much control over its internals, just the connection string. I'm trying to get a working connection via JSQSH first as a proof of concept.

As the simplest possible case, I've tried just logging into the cluster as the hive user and running `hive`. That's able to connect, and I can see the Beeline connection info:

    Connecting to jdbc:hive2://cluster-head.domain.com:2181,cluster-standby.domain.com:2181,cluster-w-1.domain.com:2181/default;password=hive;principal=hive/_HOST@REALM.COM;serviceDiscoveryMode=zooKeeper;ssl=true;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks;trustStorePassword=...;trustStoreType=jks;user=hive;zooKeeperNamespace=hiveserver2
    24/04/30 06:40:03 [main]: INFO jdbc.HiveConnection: Connected to cluster-head.domain.com:10000

I can simplify this considerably and get a working connection via Beeline:

    Connecting to jdbc:hive2://cluster-head.domain.com:10000/default;principal=hive/_HOST@DOMAIN.COM;ssl=true;sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks
    Connected to: Apache Hive (version 3.1.3000.7.1.9.3-4)
    Driver: Hive JDBC (version 3.1.3000.7.1.9.3-4)
    Transaction isolation: TRANSACTION_REPEATABLE_READ
    Beeline version 3.1.3000.7.1.9.3-4 by Apache Hive
    0: jdbc:hive2://cluster-head.domain>

However, if I try from JSQSH, despite the fact that it's pulling the driver from the same path, I get a classpath error:

    [hive@cluster-head ~]$ jsqsh -d hive2 -u "jdbc:hive2://cluster-head.domain.com:10000/default;\
    principal=hive/_HOST@DOMAIN.COM;ssl=true;\
    sslTrustStore=/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks"
    Password:
    2024-04-30 05:49:01,772 main ERROR Unable to locate appender "LOGFILE" for logger config "guru.springframework.blog.log4j2properties"
    [DEBUG] 2024-04-30 05:49:01.980 [main] Utils - Resolved authority: cluster-head.domain.com:10000
    [DEBUG] 2024-04-30 05:49:02.290 [main] MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, sampleName="Ops", about="", type=DEFAULT, value={"Rate of successful kerberos logins and latency (milliseconds)"}, valueName="Time")
    [DEBUG] 2024-04-30 05:49:02.296 [main] MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, sampleName="Ops", about="", type=DEFAULT, value={"Rate of failed kerberos logins and latency (milliseconds)"}, valueName="Time")
    [DEBUG] 2024-04-30 05:49:02.297 [main] MutableMetricsFactory - field org.apache.hadoop.metrics2.lib.MutableRate org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, sampleName="Ops", about="", type=DEFAULT, value={"GetGroups"}, valueName="Time")
    [DEBUG] 2024-04-30 05:49:02.297 [main] MutableMetricsFactory - field private org.apache.hadoop.metrics2.lib.MutableGaugeLong org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailuresTotal with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, sampleName="Ops", about="", type=DEFAULT, value={"Renewal failures since startup"}, valueName="Time")
    [DEBUG] 2024-04-30 05:49:02.298 [main] MutableMetricsFactory - field private org.apache.hadoop.metrics2.lib.MutableGaugeInt org.apache.hadoop.security.UserGroupInformation$UgiMetrics.renewalFailures with annotation @org.apache.hadoop.metrics2.annotation.Metric(always=false, sampleName="Ops", about="", type=DEFAULT, value={"Renewal failures since last successful login"}, valueName="Time")
    [DEBUG] 2024-04-30 05:49:02.304 [main] MetricsSystemImpl - UgiMetrics, User and group related metrics
    java.lang.NoClassDefFoundError: com/ctc/wstx/io/InputBootstrapper
        at org.apache.hadoop.security.UserGroupInformation.ensureInitialized(UserGroupInformation.java:314)
        at org.apache.hadoop.security.UserGroupInformation.doSubjectLogin(UserGroupInformation.java:1997)
        at org.apache.hadoop.security.UserGroupInformation.createLoginUser(UserGroupInformation.java:743)
        at org.apache.hadoop.security.UserGroupInformation.getLoginUser(UserGroupInformation.java:693)
        at org.apache.hadoop.hive.metastore.security.HadoopThriftAuthBridge.createClientWithConf(HadoopThriftAuthBridge.java:95)
        at org.apache.hive.service.auth.KerberosSaslHelper.getKerberosTransport(KerberosSaslHelper.java:57)
        at org.apache.hive.jdbc.HiveConnection.createBinaryTransport(HiveConnection.java:841)
        at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:457)
        at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:335)
        at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:107)
        at org.sqsh.SQLDriverManager$DriverShim.connect(SQLDriverManager.java:136)
        at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:677)
        at java.sql/java.sql.DriverManager.getConnection(DriverManager.java:189)
        at org.sqsh.SQLDriverManager.connect(SQLDriverManager.java:660)
        at org.sqsh.JSqsh.doConnect(JSqsh.java:535)
        at org.sqsh.JSqsh.main(JSqsh.java:249)
    Caused by: java.lang.ClassNotFoundException: com.ctc.wstx.io.InputBootstrapper
        at java.base/java.net.URLClassLoader.findClass(URLClassLoader.java:476)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:594)
        at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:527)
        ... 16 more

When I try from another machine, using the application I want to get working, I get certificate errors:

    2024-04-24 08:40:35 | javax.ws.rs.ProcessingException: javax.net.ssl.SSLHandshakeException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target

As mentioned, Kerberos-only connections work fine; I'm able to connect with URLs that look like "jdbc:hive2://cluster-head.domain.com:10000/default;principal=hive/_HOST@DOMAIN.COM". I'm trying to understand what the string should look like when SSL is also enabled, what certs I need to import, and what other setup I need to do beforehand in order for connections to work.
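For concreteness, here is a minimal standalone sketch of what I believe the client side should amountt to once SSL is enabled, assuming the Apache Hive JDBC driver is on the classpath and a kinit ticket is already in the cache. Hostnames and paths are from the thread; the realm and the trust store password handling are placeholders:

```java
import java.sql.Connection;
import java.sql.DriverManager;

public class HiveSslConnect {
    // Builds the Beeline-style URL; the trust store password is read from
    // the environment rather than hard-coded (placeholder variable name).
    static String buildUrl(String host, String trustStore, String password) {
        return "jdbc:hive2://" + host + ":10000/default"
                + ";principal=hive/_HOST@DOMAIN.COM" // placeholder realm
                + ";ssl=true"
                + ";sslTrustStore=" + trustStore
                + ";trustStorePassword=" + password;
    }

    public static void main(String[] args) throws Exception {
        String url = buildUrl("cluster-head.domain.com",
                "/var/lib/cloudera-scm-agent/agent-cert/cm-auto-global_truststore.jks",
                System.getenv("TRUSTSTORE_PASSWORD"));
        System.out.println(url);

        // Actual connection attempt is gated behind a flag so the sketch can
        // run without a reachable cluster; requires a valid Kerberos ticket.
        if (args.length > 0 && "--connect".equals(args[0])) {
            try (Connection conn = DriverManager.getConnection(url)) {
                System.out.println("Connected: " + !conn.isClosed());
            }
        }
    }
}
```

If the application truly only exposes the URL, everything has to travel in these semicolon-separated variables; there is no separate properties object to tuck the trust store into.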