JDBC client to Hive - No data or no sasl data in the stream Exception

Explorer

Hi Everybody,

We have a Kerberised cluster, and I'm trying to run a Java action in Oozie that makes a JDBC connection to Hive. This JDBC connection works fine on the Sandbox without Kerberos.

The connection string is as simple as the following, with the username and password passed alongside it:

Connection con = DriverManager.getConnection("jdbc:hive2://W12345:10000/control;principal=hive/W12345.companynet.net@COMPANYNET.NET","user123","passw123");

The Oozie action completes successfully, and the Java log does not show any error:

1742 [main] INFO org.apache.hive.jdbc.Utils  - Supplied authorities: W12345:10000
1742 [main] INFO org.apache.hive.jdbc.Utils  - Resolved authority: W12345:10000
1766 [main] INFO org.apache.hive.jdbc.HiveConnection  - Will try to open client transport with JDBC Uri: jdbc:hive2://W12345:10000/control;principal=hive/W12345.companynet.net@COMPANYNET.NET
<<< Invocation of Main class completed <<<
Oozie Launcher ends
1785 [main] INFO org.apache.hadoop.mapred.Task  - Task:attempt_1464245290012_0129_m_000000_0 is done. And is in the process of committing
1847 [main] INFO org.apache.hadoop.mapred.Task  - Task attempt_1464245290012_0129_m_000000_0 is allowed to commit now
1854 [main] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter  - Saved output of task 'attempt_1464245290012_0129_m_000000_0' to hdfs://danskehadoop/user/user123/oozie-oozi/0000013-160527101253015-oozie-oozi-W/JavaAction--java/output/_temporary/1/task_1464245290012_0129_m_000000
1909 [main] INFO org.apache.hadoop.mapred.Task  - Task 'attempt_1464245290012_0129_m_000000_0' done.

But the Java main does not actually complete its execution, because the JDBC connection fails with an exception that I can see only in the Hive log:

	ERROR [HiveServer2-Handler-Pool: Thread-78363]: server.TThreadPoolServer (TThreadPoolServer.java:run(296)) - Error occurred during processing of message.
java.lang.RuntimeException: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream
  at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:219)
  at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:739)
  at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory$1.run(HadoopThriftAuthBridge.java:736)
  at java.security.AccessController.doPrivileged(Native Method)
  at javax.security.auth.Subject.doAs(Subject.java:360)
  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1637)
  at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge$Server$TUGIAssumingTransportFactory.getTransport(HadoopThriftAuthBridge.java:736)
  at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:268)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.thrift.transport.TSaslTransportException: No data or no sasl data in the stream
  at org.apache.thrift.transport.TSaslTransport.open(TSaslTransport.java:328)
  at org.apache.thrift.transport.TSaslServerTransport.open(TSaslServerTransport.java:41)
  at org.apache.thrift.transport.TSaslServerTransport$Factory.getTransport(TSaslServerTransport.java:216)
  ... 10 more

Thanks in advance for the help!

5 REPLIES

Master Guru

You can try to add the Hive2 credential to your Java action, but I am afraid that it is not supported for a generic Java action (still worth a try):

https://oozie.apache.org/docs/4.2.0/DG_ActionAuthentication.html

If you need to get the ticket yourself, you would have to read the keytab programmatically:

https://www.ibm.com/support/knowledgecenter/SSPT3X_3.0.0/com.ibm.swg.im.infosphere.biginsights.admin...
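As a rough illustration of the keytab route, the sketch below describes a keytab-based Kerberos login to JAAS using only JDK classes. It is not the code from the linked page, which relies on Hadoop's UserGroupInformation and therefore needs the Hadoop client libraries on the classpath. The class name KeytabJaasConfig, the principal, and the keytab file name are placeholders chosen for this example.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;

// Hypothetical helper: an in-memory JAAS configuration for a keytab login,
// so that no jaas.conf file has to exist on the node running the action.
final class KeytabJaasConfig extends Configuration {
    private final String principal;
    private final String keytabPath;

    KeytabJaasConfig(String principal, String keytabPath) {
        this.principal = principal;
        this.keytabPath = keytabPath;
    }

    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        // Options understood by the JDK's Krb5LoginModule.
        Map<String, String> options = new HashMap<>();
        options.put("useKeyTab", "true");    // authenticate from a keytab
        options.put("keyTab", keytabPath);   // where the keytab lives
        options.put("principal", principal); // which principal to log in as
        options.put("storeKey", "true");     // keep the key in the Subject
        options.put("doNotPrompt", "true");  // never ask for a password
        return new AppConfigurationEntry[] {
            new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED,
                options)
        };
    }

    public static void main(String[] args) {
        // Placeholder principal and keytab name (assumptions, not from a real cluster).
        Configuration conf = new KeytabJaasConfig("user123@COMPANYNET.NET", "user123.keytab");
        AppConfigurationEntry entry = conf.getAppConfigurationEntry("hive-client")[0];
        System.out.println(entry.getLoginModuleName() + " " + entry.getOptions());
    }
}
```

A javax.security.auth.login.LoginContext created with this configuration would perform the actual login when login() is called. Whether the Hive JDBC driver then picks up the resulting Subject depends on the driver version and URL options, so treat that part as something to verify against your driver's documentation.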

And finally, why don't you use a more normal authentication mechanism for Hive, like PAM/LDAP? (ceterum censeo Carthaginem delendam esse)

Explorer

Hi Benjamin, Thank you for your reply.

Option 1 - did not work, as expected 🙂

Option 2 - In the link, at step b2, there's this command:

UserGroupInformation.loginUserFromKeytab("example_user@IBM.COM", "/path/to/example_user.keytab");

Does this mean that my user (later on it will be a system user) needs to have a keytab created on the Linux file system and distributed to all the nodes? Moreover, it might not be a great option, but isn't this authentication possible using only username/password?

Option 3 - I'm using the current mechanism because it is the only one I found examples for online. I looked briefly at PAM/LDAP, but I'm not sure yet whether it requires changes on the Hadoop cluster side. If not, I'll be happy to try it.

Master Guru

"Does this mean that my user (later on it will be a system user) needs to have a keytab created on the Linux file system and distributed to all the nodes?"

You would put the keytab in HDFS with access rights for only that user, and use the Oozie file tag to load it into your temporary execution directory:

https://oozie.apache.org/docs/3.2.0-incubating/WorkflowFunctionalSpec.html#a3.2.7_Java_Action
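Files listed in a workflow's file element are made available in the action's working directory, so the Java main can refer to the keytab by its bare file name. Here is a minimal sketch of that lookup, failing early with a clear message when the file is missing. The class LocalizedFile and the default file name user123.keytab are made up for illustration.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: locate a file that the workflow shipped alongside the action.
final class LocalizedFile {
    // Resolves fileName inside workDir and checks that it can be read.
    static Path require(Path workDir, String fileName) throws IOException {
        Path candidate = workDir.resolve(fileName).toAbsolutePath();
        if (!Files.isReadable(candidate)) {
            throw new IOException("Expected localized file is not readable: " + candidate);
        }
        return candidate;
    }

    public static void main(String[] args) {
        // The current working directory of the running JVM.
        Path workDir = Paths.get(System.getProperty("user.dir"));
        // Placeholder file name; pass the real one as the first argument.
        String name = args.length > 0 ? args[0] : "user123.keytab";
        try {
            System.out.println("Using keytab at " + require(workDir, name));
        } catch (IOException e) {
            System.err.println(e.getMessage());
        }
    }
}
```

The absolute path returned by require could then be handed to whatever keytab login call the job uses.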

"Moreover, it might not be a great option, but isn't this authentication possible using only username/password?"

To do this you need PAM or LDAP authentication, that's why I mentioned it :-). You can either hardcode the password or do the same thing we discussed above with a password file in HDFS, on which you can set access rights.
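One possible shape for the password-file variant, assuming HiveServer2 has been switched to PAM or LDAP and a one-line password file is shipped next to the job the same way as a keytab would be. The class PasswordFile and the file name hive.password are hypothetical, and the actual connection call is left as a comment because it needs the Hive JDBC driver on the classpath.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;

// Hypothetical helper: read a password from the first line of a small text file.
final class PasswordFile {
    static String read(Path file) throws IOException {
        List<String> lines = Files.readAllLines(file, StandardCharsets.UTF_8);
        if (lines.isEmpty() || lines.get(0).trim().isEmpty()) {
            throw new IOException("Password file is empty: " + file);
        }
        // Trim so a trailing newline or stray spaces do not end up in the password.
        return lines.get(0).trim();
    }

    public static void main(String[] args) throws IOException {
        // Placeholder file name; pass the real one as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "hive.password");
        String password = read(file);
        // Assumption: with PAM/LDAP the URL carries no ;principal=... part, e.g.
        // DriverManager.getConnection("jdbc:hive2://W12345:10000/control", "user123", password);
        System.out.println("Read a password of length " + password.length());
    }
}
```

Keeping the password out of the source means it can be rotated by replacing the file, and the HDFS permissions on that file decide who can read it.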

"Option 3 - I'm using the current mechanism because it is the only one I found examples for online. I looked briefly at PAM/LDAP, but I'm not sure yet whether it requires changes on the Hadoop cluster side. If not, I'll be happy to try it."

https://community.hortonworks.com/articles/591/using-hive-with-pam-authentication.html

🙂

Expert Contributor

Hi Antonio Arfe

Can you connect to HiveServer2 via beeline with your connection string?

If you can, the error may suggest that a YARN application is not starting for some reason. You can verify that by checking the ResourceManager UI.

Explorer

Hi Tahahiko, yes, it works from beeline (without username/password), but I believe that is because I run beeline from the CLI, where I already have a Kerberos ticket for my username. I'm not sure what I should look for in the ResourceManager UI: it just shows the successful MapReduce job connected to my Java action, and the log does not seem to contain much.