If you made it this far, you probably already know why you're here. Connecting DBeaver to an Impala virtual warehouse isn't difficult, but a few gotchas can make it frustrating. So let's conquer those obstacles.
Provision an Impala Virtual Warehouse
Log into a CDP instance
Navigate to Data Warehouse
Enable your Data Warehouse Environment & Database Catalog
Create a New Virtual Warehouse
Give your virtual warehouse a unique name
Select Impala
Select your database catalog from the dropdown
SSO may be enabled or disabled; the choice is yours!
Set Availability Zone and User Groups per your requirements or leave as-is
Pick the size (aka decide how much money you want to spend)
The remaining options are going to depend on your needs, but the defaults are fine for our purposes.
Click Create. Expect approximately 5 minutes to create your virtual warehouse.
Download the JDBC Driver
From your virtual warehouse tile, click on the kebab icon in the upper right. You'll find all sorts of fun options under there, but we're primarily interested in the Download JDBC/ODBC Driver option, which will download the Impala driver package to your local machine. You can leave it in your Downloads folder, or move it to wherever you like to store your jars. It will be named something like this:
impala_driver_jdbc_odbc.zip
You'll need to unzip it, which will create a new folder named impala_driver_jdbc_odbc. Inside that folder will be two additional folders; we're interested in the JDBC folder, named something like this:
ClouderaImpala_JDBC-2.6.23.1028
Within that folder will be the actual Impala drivers, named for JDBC versions 4.1 and 4.2. More info on these versions can be found here. You don't need to unzip these any further.
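If you'd rather script the unzip-and-locate step, here is a minimal Python sketch. It assumes the layout described above (a top-level folder containing a JDBC folder that holds the driver zips). The function name find_jdbc_driver_zips and the folder-name matching rule are illustrative assumptions, not part of any Cloudera tooling.

```python
import zipfile
from pathlib import Path


def find_jdbc_driver_zips(archive: Path, dest: Path) -> list[Path]:
    """Extract the downloaded archive and list the nested driver zips.

    Assumption: the driver zips sit directly inside a folder whose name
    contains "JDBC" but not "ODBC" (e.g. ClouderaImpala_JDBC-<version>).
    """
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(dest)

    def in_jdbc_folder(path: Path) -> bool:
        folder = path.parent.name.lower()
        return "jdbc" in folder and "odbc" not in folder

    # The nested zips are left as-is; they are not unzipped any further.
    return sorted(p for p in dest.rglob("*.zip") if in_jdbc_folder(p))
```

For example, find_jdbc_driver_zips(Path.home() / "Downloads" / "impala_driver_jdbc_odbc.zip", Path.home() / "jars") would return the paths of the 4.1 and 4.2 driver zips, assuming the archive is still in your Downloads folder.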
Again, from the kebab icon in your virtual warehouse tile, copy the JDBC URL. This URL has all the necessary information to make the connection, and should be of the form:
Next, we will create a new connection within DBeaver.
Create a new connection in DBeaver, selecting Cloudera Impala as the database driver.
Click Edit Driver Settings to tweak the URL template
Remove the jdbc:impala:// prefix from the URL template
Remove the :{port} from the URL template
The new URL template should look like this: {host}/{database}
Click OK
For the Host, paste the JDBC URL you copied earlier.
Leave the Port empty
Set the Database/Schema to the name of the database you want to connect to (e.g. default)
Username/Password:
If the warehouse is SSO-enabled, use your SSO credentials.
If the warehouse is not SSO-enabled, use your CDP workload credentials.
Add the Impala driver
Click on the Edit Driver Settings button
DBeaver may have installed with a driver for Impala, but you may find that it is not fully compatible with your virtual warehouse. Open the Libraries tab and Delete the existing drivers to avoid any conflict.
Click Add File to add the Impala driver (the 41 or 42 zip file) you downloaded earlier.
Click OK
Click Test Connection and verify that you can connect. If your virtual warehouse is SSO-enabled, DBeaver will open a browser tab so you can authenticate if you aren't already authenticated.
Click Finish to save the new connection.
Once the connection is created, you can navigate the database, tables, columns, etc, as well as query your data. Congratulations, you did it (and there was much rejoicing).
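If you're wondering why the URL template had to be edited, here is a small Python sketch of the idea. It is a simplified model of placeholder substitution, not DBeaver's actual implementation, and the copied URL is a made-up placeholder rather than a real warehouse address. Because the pasted Host value is already a complete JDBC URL, the default template would add the jdbc:impala:// prefix a second time.

```python
def expand_template(template: str, **fields: str) -> str:
    """Fill {name} placeholders in a URL template (simplified model)."""
    url = template
    for name, value in fields.items():
        url = url.replace("{" + name + "}", value)
    return url


# Hypothetical placeholder: use the JDBC URL copied from your warehouse tile.
copied_url = "jdbc:impala://example-coordinator.example.site:443"

default_template = "jdbc:impala://{host}:{port}/{database}"
edited_template = "{host}/{database}"

# With the default template, the prefix is duplicated and the URL is invalid.
broken = expand_template(default_template, host=copied_url, port="", database="default")

# With the edited template, the copied URL passes through unchanged.
fixed = expand_template(edited_template, host=copied_url, database="default")

print(broken)
print(fixed)
```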
Tips
If you have connectivity issues even after successfully testing your connection, try an Invalidate/Reconnect or a full Disconnect + Reconnect to reset the connection.
If a query seems to take a long time to run, check the status of your virtual warehouse. It is likely that it was stopped and needs to restart before it can execute the query.
You may need your cloud firewall rules set to allow traffic on port 443 from your IP address.
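To rule out a firewall problem before digging into DBeaver, you can check whether port 443 is reachable from your machine. This is a minimal Python sketch using only the standard library. The helper name can_reach is an assumption for illustration, and a successful TCP connection only shows the port is open; it does not prove that authentication will succeed.

```python
import socket


def can_reach(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

Call it with the hostname portion of your JDBC URL, for example can_reach("your-warehouse-hostname"). If it returns False, check your firewall rules before looking at the driver or credentials.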