Continuing my series of how-to articles for CDP, today we explore how to connect to Impala via JDBC from Python. My example uses a Jupyter notebook running in CML, but the approach generalizes to other environments.
This process is actually fairly easy, so let's dive in.
First, put the Impala JDBC driver JARs on the classpath and install the Python packages (shell commands):

CLASSPATH=.:/home/cdsw/ImpalaJDBC4.jar:/home/cdsw/ImpalaJDBC41.jar:/home/cdsw/ImpalaJDBC42.jar
export CLASSPATH
pip3 install JayDeBeApi
pip3 install --upgrade jpype1==0.6.3 --user

Then connect and run a query from Python:

import jaydebeapi

# Replace the bracketed placeholders with your own host and credentials.
conn = jaydebeapi.connect(
    "com.cloudera.impala.jdbc.DataSource",
    "jdbc:impala://[your_host]:443/;ssl=1;transportMode=http;httpPath=icml-data-mart/cdp-proxy-api/impala;AuthMech=3;",
    {'UID': "[your_cdp_user]", 'PWD': "[your_workload_pwd]"},
    '/home/cdsw/ImpalaJDBC41.jar')
curs = conn.cursor()
curs.execute("select * from default.locations")
curs.fetchall()  # returns every row of the result set
curs.close()
conn.close()
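One thing to watch in the snippet above: the cursor and connection are closed only if every statement succeeds. The following is a minimal sketch, not part of the original walkthrough, of how the query step could be wrapped so the cursor is always closed. The helper name fetch_rows is hypothetical, and it assumes only that the connection follows the Python DB-API (which JayDeBeApi connections do).

```python
from contextlib import closing


def fetch_rows(conn, sql):
    """Run `sql` on an open DB-API connection and return all rows.

    `fetch_rows` is a hypothetical helper name used for illustration.
    The cursor is closed even if execute() or fetchall() raises.
    """
    with closing(conn.cursor()) as curs:
        curs.execute(sql)
        return curs.fetchall()
```

With a JayDeBeApi connection this could be called as fetch_rows(conn, "select * from default.locations"). Wrapping the connection itself in closing(...) in the same way would also release it on error.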
The following is a screenshot of my code in action:
Created on 03-23-2021 09:59 AM
Is there any way to have CDSW connect to Impala to run straight SQL?
I have searched many tutorials and suggestions on the web and have found none that work with CDSW in our environment.
Created on 03-24-2021 07:20 PM - edited 03-24-2021 07:22 PM
Hello,
Nice tutorial, this library is fast!
If anyone is running into
java.sql.SQLExceptionPyRaisable: java.sql.SQLException: [Cloudera][ImpalaJDBCDriver](500605) Error occurred while opening a session with the server. No additional detail from the server regarding this error is available. Please ensure that the driver configuration is compatible with the server configuration. This type of error can also occur when the server is too busy to handle the request. Please try again later.
I was able to fix it by changing the httpPath parameter in the Impala JDBC URL from "icml-data-mart/cdp-proxy-api/impala" to "cliservice", as follows:
"jdbc:impala://"+os.environ["IMPALA_HOST"]+":443/;ssl=1;transportMode=http;httpPath=cliservice;AuthMech=3;"
Hope this helps anyone!
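For anyone who needs to switch between the two httpPath values, here is a rough sketch of building the URL from its parts. The function name impala_jdbc_url and its defaults are my own assumptions for illustration. The URL parameters themselves are the ones used in the article and in the fix above, and IMPALA_HOST is the same environment variable read there.

```python
import os


def impala_jdbc_url(host, http_path="cliservice", port=443):
    """Assemble an Impala JDBC URL for HTTP transport over SSL.

    Hypothetical helper: the parameter names and defaults are
    assumptions, the URL layout mirrors the one used in this thread.
    """
    return (
        f"jdbc:impala://{host}:{port}/;ssl=1;transportMode=http;"
        f"httpPath={http_path};AuthMech=3;"
    )


# Fall back to the article's placeholder if IMPALA_HOST is not set.
url = impala_jdbc_url(os.environ.get("IMPALA_HOST", "[your_host]"))
```

Passing http_path="icml-data-mart/cdp-proxy-api/impala" reproduces the URL from the article, so trying both values is a one-argument change.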
Created on 08-19-2021 06:24 AM
Hello, I am running this from my company network, and I believe we use some sort of certificate for Cloudera Impala. When I copy the URL from impala_prod, it also includes at the end a UID (which is my ID) and a password, which is a standard password (not one I provided at any point).
So when I run this script, this is the error I receive:
java.sql.SQLException: java.sql.SQLException: [Cloudera][ImpalaJDBCDriver](500170) Error occurred while setting up ALTUS Dynamic Discovery: Unable to load credentials from provider files.
Do you have any idea how I can fix this?