06-23-2016 06:04 PM
I am able to connect without a problem to Impala via ODBC with the following connection string:
'Driver=Cloudera ODBC Driver for Impala;Host=sasgridprod.bancolombia.corp;Port=21050;AuthMech=1;SSL=1;KrbRealm=BANCOLOMBIA.CORP;KrbFQDN=sasgridprod.bancolombia.corp;KrbServiceName=impala;TrustedCerts=D:/_DATOS/sasgridprod.bancolombia.corp.pem'
Now I am trying to do the same via JDBC. I am using the JDBC4 driver (latest version 188.8.131.521; the class is com.cloudera.impala.jdbc4.Driver).
I am trying to follow the instructions here: Cloudera JDBC-Driver for Impala Install Guide
To set up the JDBC connection string:
I have tried several alternatives, such as
which would be the closest analog to my working ODBC string.
When I try to connect I get the following exception:
java.util.concurrent.ExecutionException: java.lang.RuntimeException: java.sql.SQLException: [Simba][ImpalaJDBCDriver](500164) Error initialized or created transport for authentication: [Simba][ImpalaJDBCDriver](500169) Unable to connect to server: javax.net.ssl.SSLHandshakeException: sun.security.validator.ValidatorException: PKIX path building failed: sun.security.provider.certpath.SunCertPathBuilderException: unable to find valid certification path to requested target.
Then I noticed that the TrustedCerts property doesn't seem to exist for JDBC. So I tried replacing it with cacerts and also with jssecacerts, as mentioned on page 18 of the guide above:
The stack trace is the same as before...
Any ideas on how to properly build the jdbc string given the fully working ODBC string above?
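For readers following along, the mapping from the ODBC settings to a JDBC URL can be sketched roughly as below. This is only an illustration, not the string that was actually tried: the jdbc:impala:// prefix and the AuthMech, KrbRealm, KrbHostFQDN and KrbServiceName names follow the JDBC URL quoted later in this thread, SSL=1 is assumed to carry over unchanged from the ODBC string, and the ImpalaJdbcUrl class and its build method are hypothetical helpers. Note that no TrustedCerts equivalent is included, since that is exactly the open question.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: assembles "jdbc:impala://host:port;Key=Value;..." from a property map.
public class ImpalaJdbcUrl {

    public static String build(String host, int port, Map<String, String> props) {
        StringBuilder url = new StringBuilder("jdbc:impala://")
                .append(host).append(':').append(port);
        // Each driver property is appended as ;Key=Value, in insertion order.
        for (Map.Entry<String, String> e : props.entrySet()) {
            url.append(';').append(e.getKey()).append('=').append(e.getValue());
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("AuthMech", "1");                               // Kerberos, as in the ODBC string
        props.put("KrbRealm", "BANCOLOMBIA.CORP");
        props.put("KrbHostFQDN", "sasgridprod.bancolombia.corp"); // ODBC calls this KrbFQDN
        props.put("KrbServiceName", "impala");
        props.put("SSL", "1");                                    // assumption: same flag as ODBC
        System.out.println(build("sasgridprod.bancolombia.corp", 21050, props));
    }
}
```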
(By the way, I have MIT Kerberos Ticket Manager installed and I have been careful to define the KRB5CCNAME environment variable pointing to the tickets file.)
Many thanks in advance.
06-24-2016 08:47 AM
Update: I managed to connect.
The one thing I did was to add the SSL certificates from my .pem file directly to the "certificate store" used by Java.
I did this via the following command:
keytool -import -alias sasgridprod -keystore "C:\Program Files\Java\jre1.8.0_40\lib\security\cacerts" -file sasgridprod.bancolombia.corp.pem
When asked, I used the default password for the cacerts file, which is: changeit
After that I restarted my Java client application and was able to connect.
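To double-check that the import took effect, the trust store can be inspected from Java itself. The following is a minimal sketch using only the standard java.security.KeyStore API; the TrustStoreCheck class and its method names are hypothetical, and it assumes the running JRE is the same one whose cacerts file was modified and that the password is still the default changeit.

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.security.KeyStore;

// Hypothetical helper: checks whether an alias is present as a trusted certificate.
public class TrustStoreCheck {

    // True only if the alias exists and refers to a trusted-certificate entry.
    public static boolean hasTrustedCert(KeyStore store, String alias) throws Exception {
        return store.containsAlias(alias) && store.isCertificateEntry(alias);
    }

    // Loads a key store file using the JVM's default key store type.
    public static KeyStore load(Path file, char[] password) throws Exception {
        KeyStore store = KeyStore.getInstance(KeyStore.getDefaultType());
        try (InputStream in = Files.newInputStream(file)) {
            store.load(in, password);
        }
        return store;
    }

    public static void main(String[] args) throws Exception {
        // Assumption: cacerts lives under <java.home>/lib/security, as in the keytool command.
        Path cacerts = Paths.get(System.getProperty("java.home"), "lib", "security", "cacerts");
        KeyStore store = load(cacerts, "changeit".toCharArray());
        System.out.println("sasgridprod trusted: " + hasTrustedCert(store, "sasgridprod"));
    }
}
```

If this reports false even though keytool succeeded, the client application is probably running on a different Java installation than the one whose cacerts was updated.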
The client application I am trying to use is SquirrelSQL.
After connecting, the app hangs for a long while and finally comes back. I assume this is caused by our database already having around 2000 tables, and by SquirrelSQL fetching information about all the schemas by default...
It seems there are two ways around this.
One is to configure it not to load any schemas at all. But then it hangs when I write an SQL statement; I am not sure why.
Another is to configure it to load and cache all schemas. Then it hangs as before only the first time one opens the schema list; the second time it uses the cache instead of reloading, and is thus very responsive.
However, in this case SquirrelSQL still hangs in other places. I am not sure whether this is due to the app itself or to the fetching of table metadata being slow...
Does anybody have an idea about this or suggestions about other Java based SQL clients that can connect to Impala (and give you full control of the connection string) and might work better?
01-18-2017 11:43 PM
I'm getting the same error with BO.
However, I tested the driver and connectivity from R (which uses JDBC in the background). The following code works without any error.
library(RJDBC)
# Load every jar shipped with the Simba Hive driver
drvH <- JDBC(driverClass = "com.simba.hive.jdbc4.HS2Driver",
             classPath = normalizePath(list.files("Drivers/BO-Simba/BO_Drivers/hive012simba4server1/",
                                                  pattern = "\\.jar$", full.names = TRUE, recursive = TRUE)))
connH <- dbConnect(drvH, "jdbc:hive2://master1.rbi.org.in:10000;AuthMech=1;KrbRealm=MYREALM.COM;KrbHostFQDN=master1.rbi.org.in;KrbServiceName=hive")
dbGetQuery(connH, "show databases")
But the following code
# Load every jar shipped with the Simba Impala driver
drvI <- JDBC(driverClass = "com.simba.impala.jdbc4.Driver",
             classPath = normalizePath(list.files("Drivers/BO-Simba/BO_Drivers/impala10simba4/",
                                                  pattern = "\\.jar$", full.names = TRUE, recursive = TRUE)))
# dbConnect fails here with the error shown below
connI <- dbConnect(drvI, "jdbc:impala://slave1.rbi.org.in:21050;AuthMech=1;KrbRealm=MYREALM.COM;KrbHostFQDN=master1.rbi.org.in;KrbServiceName=impala")
gives the error
[Simba][ImpalaJDBCDriver](500164) Error initialized or created transport for authentication: Unable to connect to server
Kindly help if you know the reason. I have not enabled SSL in the cluster. I have Kerberos and Sentry on CDH 5.9 (OS: RedHat 6). The client is, for now, one of the nodes in the cluster (minimal firewall intervention).
Hive works, but Impala gives this problem. I have tried the Cloudera drivers too (again, Hive works but Impala does not).
01-27-2017 06:04 AM