Created on 06-15-2016 03:21 PM - edited 09-16-2022 03:25 AM
I've downloaded the Hortonworks Sandbox 2.4 to develop some tools locally on my machine. One of the first things I want to do is load data into Hive. I first tried the regular JDBC connector, which worked but was way too slow.
When doing this I ran across the first interesting issue: the sandbox has authentication enabled and controlled by Ranger. So when I connect using beeline and the URL jdbc:hive2://localhost:10000 I am asked for a username and password. However, when connecting from Java, no credentials were required and I could read and insert data. Can someone explain this?
public DataSource dataSource() {
    return new SimpleDriverDataSource(new HiveDriver(), "jdbc:hive2://localhost:10000/variantdatabase");
}
Then I learned about the streaming API, which seemed a better alternative for loading lots of data into Hive (regular load file doesn't work for me). So I started following this article: https://cwiki.apache.org/confluence/display/Hive/Streaming+Data+Ingest#StreamingDataIngest-Streaming... .
Relevant code:
HiveEndPoint hiveEP = new HiveEndPoint("hive2://localhost:10000", "variantdatabase", "variant", null);
this.connection = hiveEP.newConnection(true);
However, connecting takes ages, and after a while I get the following message in the client:
17:03:47.742 [main] INFO org.apache.hive.jdbc.HiveConnection - Will try to open client transport with JDBC Uri: jdbc:hive2://localhost:10000/variantdatabase
17:03:48.518 [main] DEBUG o.a.h.h.streaming.HiveEndPoint - Overriding HiveConf setting : hive.txn.manager = org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
17:03:48.519 [main] DEBUG o.a.h.h.streaming.HiveEndPoint - Overriding HiveConf setting : hive.support.concurrency = true
17:03:48.519 [main] DEBUG o.a.h.h.streaming.HiveEndPoint - Overriding HiveConf setting : hive.metastore.execute.setugi = true
17:03:48.519 [main] DEBUG o.a.h.h.streaming.HiveEndPoint - Overriding HiveConf setting : hive.execution.engine = mr
17:03:48.706 [main] WARN o.a.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17:03:48.735 [main] INFO hive.metastore - Trying to connect to metastore with URI hive2://localhost:10000
17:13:48.814 [main] WARN hive.metastore - set_ugi() not successful, Likely cause: new client talking to old server. Continuing without it.
org.apache.thrift.transport.TTransportException: java.net.SocketTimeoutException: Read timed out
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:129) ~[hive-exec-1.2.1.jar:1.2.1]
When I look in the server log it says something about SASL, but I don't understand why, because JDBC didn't need it. And where can I define a username/password?
Caused by: org.apache.thrift.transport.TTransportException: Invalid status -128
    at org.apache.thrift.transport.TSaslTransport.sendAndThrowMessage(TSaslTransport.java:232)
    at org.apache.thrift.transport.TSaslTransport.receiveSaslMessage(TSaslTransport.java:184)
    at org.apache.thrift.transport.TSaslServerTransport.handleSaslStartMessage(TSaslServerTransport.java:125)
Created 06-16-2016 09:26 AM
@Steven Castelein In a secure environment, HiveServer2 uses plain SASL as its default authentication mode.
You can disable it by setting hive.server2.authentication = 'NOSASL' in hive-site.xml.
Or, to use SASL: (https://cwiki.apache.org/confluence/display/Hive/Setting+Up+HiveServer2#SettingUpHiveServer2-Integrity/ConfidentialityProtection)
Integrity/Confidentiality Protection
Integrity protection and confidentiality protection (beyond just the default of authentication) for communication between the Hive JDBC driver and HiveServer2 are enabled (Hive 0.12 onward, see HIVE-4911). You can use the SASL QOP property to configure this.
hive.server2.thrift.sasl.qop in hive-site.xml has to be set to one of the valid QOP values ('auth', 'auth-int' or 'auth-conf'). You can then connect via the URL below:
jdbc:hive2://<m/c HS2>:10001/default;principal=<hive principal>?transportMode=http;httpPath=cliservice;auth=kerberos;sasl.qop=auth-int (if auth-int is set)
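Note that if you do switch the server to NOSASL, the client has to match: as far as I know the Hive JDBC driver still attempts a SASL handshake unless the URL carries ;auth=noSasl, and a NOSASL server cannot parse that handshake. With NONE (plain SASL) no extra parameter is needed. A minimal sketch of the two URL shapes follows; the HiveJdbcUrl class is a hypothetical helper for illustration and not part of Hive, and only these two modes are covered.

```java
import java.util.Locale;

// Hypothetical helper (not part of Hive): builds a HiveServer2 JDBC URL that
// matches the server-side hive.server2.authentication setting.
class HiveJdbcUrl {
    static String forAuthMode(String host, int port, String database, String authMode) {
        String base = "jdbc:hive2://" + host + ":" + port + "/" + database;
        switch (authMode.toUpperCase(Locale.ROOT)) {
            case "NOSASL":
                // Raw Thrift transport: the client must opt out of SASL as well.
                return base + ";auth=noSasl";
            case "NONE":
                // Plain SASL: no extra URL parameter is needed.
                return base;
            default:
                throw new IllegalArgumentException("Mode not covered by this sketch: " + authMode);
        }
    }

    public static void main(String[] args) {
        System.out.println(forAuthMode("localhost", 10000, "variantdatabase", "NOSASL"));
        System.out.println(forAuthMode("localhost", 10000, "variantdatabase", "NONE"));
    }
}
```

The resulting string can be passed to the same SimpleDriverDataSource or to DriverManager.getConnection(url, user, password) if you want to supply credentials explicitly.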
Created 06-16-2016 01:23 PM
Thanks for your response but I still can't get it to work. I tried setting the NOSASL value for the hive.server2.authentication property. Now the following happens:
* Connecting via beeline fails: I'm asked for a username / password, but the one I used successfully before no longer works.
* I cannot open a JDBC connection anymore:
Caused by: java.sql.SQLException: Could not open client transport with JDBC Uri: jdbc:hive2://localhost:10000/variantdatabase: null
    at org.apache.hive.jdbc.HiveConnection.openTransport(HiveConnection.java:231)
Caused by: org.apache.thrift.transport.TTransportException
    at org.apache.thrift.transport.TIOStreamTransport.read(TIOStreamTransport.java:132)
Server log:
2016-06-16 13:12:39,018 ERROR [HiveServer2-Handler-Pool: Thread-32]: server.TThreadPoolServer (TThreadPoolServer.java:run(294)) - Thrift error occurred during processing of message.
org.apache.thrift.protocol.TProtocolException: Missing version in readMessageBegin, old client?
    at org.apache.thrift.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:228)
Why is everything so counter-intuitive? With authentication enabled, JDBC works without any credentials, but with it disabled it doesn't. Why can't I just specify a username/password for a connection from the streaming ingest code, using the HiveEndPoint constructor or the newConnection() method? And for beeline, I have authentication disabled but still get asked for a username/password.
Created 06-17-2016 12:31 PM
OK, so the solution is quite simple here: I tried to connect to HiveServer2, which runs on port 10000, whereas I should have connected to the metastore, which runs on port 9083.
hive.server2.authentication is set to NONE, not NOSASL.
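For anyone hitting the same thing, here is a minimal sketch of the corrected endpoint. It assumes the metastore is addressed with a thrift:// URI (the form used in hive.metastore.uris); the MetastoreUriCheck class is a hypothetical pre-flight helper of my own and not part of the Hive API, and the actual HiveEndPoint calls are left as comments because they need hive-hcatalog-streaming on the classpath.

```java
import java.net.URI;

// Hypothetical pre-flight check (not part of the Hive API): the streaming
// HiveEndPoint expects the metastore Thrift URI, not a HiveServer2 address.
class MetastoreUriCheck {
    static boolean looksLikeMetastoreUri(String uri) {
        URI parsed = URI.create(uri);
        // Require the thrift scheme and an explicit port.
        return "thrift".equals(parsed.getScheme()) && parsed.getPort() != -1;
    }

    public static void main(String[] args) {
        String metastoreUri = "thrift://localhost:9083"; // sandbox metastore port from this thread
        if (!looksLikeMetastoreUri(metastoreUri)) {
            throw new IllegalArgumentException("Not a metastore URI: " + metastoreUri);
        }
        // With the streaming library available, the endpoint would then be created as:
        // HiveEndPoint hiveEP = new HiveEndPoint(metastoreUri, "variantdatabase", "variant", null);
        // StreamingConnection connection = hiveEP.newConnection(true);
        System.out.println("OK: " + metastoreUri);
    }
}
```

The check rejects the address I used originally (hive2://localhost:10000), which fails only after a ten-minute read timeout when handed to HiveEndPoint directly.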