
Error while connecting to hive from Spark using jdbc connection string


New Contributor

I am trying to connect to Hive from Spark inside the map function, like below:

String driver = "org.apache.hive.jdbc.HiveDriver";
Class.forName(driver);
Connection con_con1 = DriverManager.getConnection(
    "jdbc:hive2://server1.net:10001/default;principal=hive/server1.net@abc.xyz.NET;ssl=false;transportMode=http;httpPath=cliservice",
    "username", "password");

But I am getting this error:

javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]

1 ACCEPTED SOLUTION

Accepted Solutions

Re: Error while connecting to hive from Spark using jdbc connection string

New Contributor

OK, I have resolved the Kerberos ticket issue by adding the lines below to my Java code.

System.setProperty("java.security.auth.login.config", "gss-jaas.conf");
System.setProperty("sun.security.jgss.debug", "true");
System.setProperty("javax.security.auth.useSubjectCredsOnly", "false");
System.setProperty("java.security.krb5.conf", "krb5.conf");
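For reference, gss-jaas.conf here is a JAAS configuration file; with useSubjectCredsOnly=false, JGSS looks up the com.sun.security.jgss.initiate entry. A minimal sketch of such a file (the keytab path and principal are placeholders, not values from this thread):

```
com.sun.security.jgss.initiate {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  keyTab="/path/to/user.keytab"
  principal="username@abc.xyz.NET"
  storeKey=true
  doNotPrompt=true;
};
```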

But now I am getting a different error:

WARN scheduler.TaskSetManager: Lost task 1.0 in stage 1.0 (TID 7, SGSCAI0068.inedc.corpintra.net): java.sql.SQLException: Could not open connection to jdbc:hive2://**********.inedc.corpintra.net:10001/default;principal=hive/******.inedc.corpintra.net@*****;transportMode=http;httpPath=cliservice: null
    at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:206)
    at org.apache.hive.jdbc.HiveConnection.<init>(HiveConnection.java:178)
    at org.apache.hive.jdbc.HiveDriver.connect(HiveDriver.java:105)
    at java.sql.DriverManager.getConnection(DriverManager.java:571)
    at java.sql.DriverManager.getConnection(DriverManager.java:233)
    at GetDataFromXMLStax.returnXmlTagMatchStatus(GetDataFromXMLStax.java:77)
    at SP_LogFileNotifier_FileParsingAndCreateEmailContent$2.call(SP_LogFileNotifier_FileParsingAndCreateEmailContent.java:200)
    at SP_LogFileNotifier_FileParsingAndCreateEmailContent$2.call(SP_LogFileNotifier_FileParsingAndCreateEmailContent.java:1)
    at org.apache.spark.api.java.JavaPairRDD$anonfun$toScalaFunction$1.apply(JavaPairRDD.scala:1027)
    at scala.collection.Iterator$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$anon$11.next(Iterator.scala:328)
    at org.apache.spark.rdd.PairRDDFunctions$anonfun$saveAsHadoopDataset$1$anonfun$13$anonfun$apply$6.apply$mcV$sp(PairRDDFunctions.scala:1109)
    at org.apache.spark.rdd.PairRDDFunctions$anonfun$saveAsHadoopDataset$1$anonfun$13$anonfun$apply$6.apply(PairRDDFunctions.scala:1108)
    at org.apache.spark.rdd.PairRDDFunctions$anonfun$saveAsHadoopDataset$1$anonfun$13$anonfun$apply$6.apply(PairRDDFunctions.scala:1108)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1285)
    at org.apache.spark.rdd.PairRDDFunctions$anonfun$saveAsHadoopDataset$1$anonfun$13.apply(PairRDDFunctions.scala:1116)
    at org.apache.spark.rdd.PairRDDFunctions$anonfun$saveAsHadoopDataset$1$anonfun$13.apply(PairRDDFunctions.scala:1095)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:63)
    at org.apache.spark.scheduler.Task.run(Task.scala:70)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
    at org.apache.thrift.transport.TTransport.readAll(TTransport.java:84)
    at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:182)
    at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:258)
    at org.apache.thrift.transport.TSaslClientTransport.open(TSaslClientTransport.java:37)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:52)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport$1.run(TUGIAssumingTransport.java:49)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
    at org.apache.hadoop.hive.thrift.client.TUGIAssumingTransport.open(TUGIAssumingTransport.java:49)
    at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:203)
    ... 22 more

It seems the connection string is not correct. I am calling this connection string from inside a Spark program.

Connection con_con1 = DriverManager.getConnection(
    "jdbc:hive2://**********.inedc.corpintra.net:10001/default;principal=hive/**********.inedc.corpintra.net@*****.NET;transportMode=http;httpPath=cliservice");

Please help me with the correct connection string.

7 REPLIES 7

Re: Error while connecting to hive from Spark using jdbc connection string

Contributor

This looks like a credential issue. Check that the credentials you entered are correct.

Re: Error while connecting to hive from Spark using jdbc connection string

New Contributor

The credentials are correct. The same connection string works fine from the edge node, but not from the Spark program.

Re: Error while connecting to hive from Spark using jdbc connection string

Expert Contributor

I don't know if this applies to your situation, but there is a Java issue that causes this type of exception with Kerberos 1.8.1 or higher. If your error is related to it, run kinit -R after the initial kinit to renew a renewable ticket. There are some additional suggestions at https://community.hortonworks.com/articles/4755/common-kerberos-errors-and-solutions.html.

Re: Error while connecting to hive from Spark using jdbc connection string

Contributor

Your cluster is Kerberized, and this looks like a Kerberos ticket issue: you do not have a valid Kerberos ticket. Before running the command, obtain a valid ticket for your user:

kinit

Enter your Kerberos password when prompted, then verify the ticket:

klist

This displays the Kerberos ticket for your user. Now try connecting to Hive again.


Re: Error while connecting to hive from Spark using jdbc connection string

Super Guru

Why connect to Hive via JDBC at all? Use Spark SQL with a HiveContext, which gives full access to all Hive tables. This path is optimized in Spark and very fast.

https://spark.apache.org/docs/1.6.0/sql-programming-guide.html
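A minimal sketch of that approach, assuming Spark 1.6 as in the linked guide (the application and table names are placeholders):

```java
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.hive.HiveContext;

public class HiveContextExample {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf().setAppName("HiveContextExample");
        JavaSparkContext sc = new JavaSparkContext(conf);

        // HiveContext talks to the Hive metastore directly, so no JDBC
        // driver class or connection string is needed.
        HiveContext hiveContext = new HiveContext(sc.sc());

        // Query a Hive table as a DataFrame (table name is illustrative).
        DataFrame result = hiveContext.sql("SELECT * FROM default.my_table LIMIT 10");
        result.show();

        sc.stop();
    }
}
```

Submit this with spark-submit on the cluster; on a Kerberized cluster, spark-submit handles delegation tokens for you, which sidesteps the ticket issues above.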

Re: Error while connecting to hive from Spark using jdbc connection string

New Contributor

Yes, I agree, but I am trying to run the Hive query inside the map method. When I use the HiveContext there, I get an error that the class is not serializable.
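A common pattern for "not serializable" errors is to keep the non-serializable resource out of the closure state: mark it transient and re-create it lazily on the executor after deserialization. A plain-Java sketch of the idea (class and field names are illustrative; the string stands in for a real DriverManager.getConnection call):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Only serializable configuration is shipped; the "connection" itself
// is transient and rebuilt on each worker.
class HiveLookup implements Serializable {
    private final String jdbcUrl;            // serializable config only
    private transient Object connection;     // never serialized

    HiveLookup(String jdbcUrl) { this.jdbcUrl = jdbcUrl; }

    Object getConnection() {
        if (connection == null) {
            // Placeholder for DriverManager.getConnection(jdbcUrl, ...)
            connection = "connected:" + jdbcUrl;
        }
        return connection;
    }
}

public class SerializableSketch {
    public static void main(String[] args) throws Exception {
        HiveLookup lookup = new HiveLookup("jdbc:hive2://host:10001/default");

        // Round-trip through Java serialization, as Spark does when
        // shipping a closure to executors.
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(lookup);
        oos.flush();
        HiveLookup onWorker = (HiveLookup) new ObjectInputStream(
                new ByteArrayInputStream(bos.toByteArray())).readObject();

        // The transient field was dropped and is re-created lazily here.
        System.out.println(onWorker.getConnection());
        // prints: connected:jdbc:hive2://host:10001/default
    }
}
```

In Spark specifically, the usual variants of this are mapPartitions (open one connection per partition inside the function body) or a static/lazy holder, rather than capturing a connection or context object in the closure.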
