Created on 04-17-2020 08:07 AM - edited on 04-30-2020 12:47 AM by VidyaSargur
In this article, I document how to use CFM 1.0.1.0 to interact with Apache Impala. The same steps apply if you are using HDF or Apache NiFi.
The latest official Impala JDBC driver that works with NiFi is version 2.6.4; earlier versions also work.
At the time of this writing, any later driver causes class conflicts between the NiFi JVM and the driver's own use of log4j.
Prerequisites
Downloading and extracting the JDBC driver:
Create Impala table and load dataset sample to HDFS:
hdfs dfs -put data/tips.csv /user/hive/warehouse/tips/
impala-shell -i <impala_daemon_hostname>:21000 -q ' CREATE TABLE default.tips ( `total_bill` FLOAT, `tip` FLOAT, `sex` STRING, `smoker` STRING, `day` STRING, `time` STRING, `size` TINYINT) ROW FORMAT DELIMITED FIELDS TERMINATED BY "," LOCATION "hdfs:///user/hive/warehouse/tips/";'
* These steps were taken from this article.
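Before uploading the file, it can help to confirm that each row matches the column layout of the table. The following is a minimal sketch, not part of the original steps: the check_rows helper and the sample rows are hypothetical, and Python's csv module handles quoted fields, which a plain comma-delimited Impala text table does not, so treat the result as a rough sanity check only.

```python
import csv
import io

# Column names and Python-side parsers mirroring the CREATE TABLE statement above.
SCHEMA = [
    ("total_bill", float),
    ("tip", float),
    ("sex", str),
    ("smoker", str),
    ("day", str),
    ("time", str),
    ("size", int),
]


def check_rows(lines):
    """Return (line_number, problem) pairs for rows that do not fit the schema.

    Hypothetical helper: reports at most one problem per row.
    """
    problems = []
    for line_no, row in enumerate(csv.reader(lines), start=1):
        if len(row) != len(SCHEMA):
            problems.append((line_no, f"expected {len(SCHEMA)} fields, got {len(row)}"))
            continue
        for (name, parse), value in zip(SCHEMA, row):
            try:
                parsed = parse(value)
            except ValueError:
                problems.append((line_no, f"{name}: cannot parse {value!r}"))
                break
            # TINYINT is a 1-byte signed integer.
            if name == "size" and not -128 <= parsed <= 127:
                problems.append((line_no, f"size: {parsed} is outside the TINYINT range"))
    return problems


# Made-up sample: one data row followed by a stray header row.
sample = io.StringIO(
    "16.99,1.01,Female,No,Sun,Dinner,2\n"
    "total_bill,tip,sex,smoker,day,time,size\n"
)
print(check_rows(sample))
```

To check a real file, pass an open file handle instead of the sample, for example check_rows(open("data/tips.csv", newline="")). A header row, if the file has one, shows up as an unparsable total_bill value.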
Configure NiFi to interact with Impala:
In NiFi, drag an ExecuteSQL processor onto the canvas.
Configure the Database Connection Pooling Service property on the ExecuteSQL processor.
This is a pointer to the DBCPConnectionPool controller service that you will need to configure:
The driver documentation does a good job of explaining the different settings you can pass. If you will be interacting with an Impala that is TLS-secured and/or Kerberized, there are options for that. In my example, I am interacting with a TLS-secured and Kerberized Impala.
In the controller services section, configure your DBCPConnectionPool and set the following properties:
My example:
Database Connection URL: jdbc:impala://YourImpalaHostFQDN:YourPort
Database Driver Class Name: com.cloudera.impala.jdbc41.Driver
Database Driver Location(s): the path to the JDBC driver (ImpalaJDBC41.jar) you downloaded earlier.
Back in the ExecuteSQL processor, add your SQL command. This example runs a simple select query, so set the SQL select query property to: select * from default.tips
That should be all you need.
If you are connecting to a TLS-secured and/or Kerberized Impala, consult the driver documentation for the options that apply to you. For reference, my connection string looked like this when connecting to a TLS-secured and Kerberized Impala:
jdbc:impala://MyImpalaHost:21050;AuthMech=1;KrbHostFQDN=MyImpalaHostFQDN;KrbServiceName=impala;ssl=1;SSLTrustStore=/My/JKS/Trustore;SSLTrustStorePwd=YourJKSPassword
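The connection string above is the base URL followed by semicolon-separated key=value driver options. As a minimal sketch of that layout, the hypothetical build_impala_jdbc_url helper below assembles the same string from its parts. It assumes the values need no escaping, and it only builds text: it does not validate the options against the driver.

```python
def build_impala_jdbc_url(host, port, options=None):
    """Join host, port and driver options into a jdbc:impala:// URL.

    Hypothetical helper: options are appended as ;key=value in insertion order,
    with no escaping of special characters.
    """
    url = f"jdbc:impala://{host}:{port}"
    for key, value in (options or {}).items():
        url += f";{key}={value}"
    return url


# Option names and placeholder values copied from the example connection string.
secure_options = {
    "AuthMech": 1,
    "KrbHostFQDN": "MyImpalaHostFQDN",
    "KrbServiceName": "impala",
    "ssl": 1,
    "SSLTrustStore": "/My/JKS/Trustore",
    "SSLTrustStorePwd": "YourJKSPassword",
}

print(build_impala_jdbc_url("MyImpalaHost", 21050, secure_options))
```

Keeping the options in a dictionary makes it easier to see which settings belong to Kerberos (AuthMech, KrbHostFQDN, KrbServiceName) and which to TLS (ssl, SSLTrustStore, SSLTrustStorePwd) before pasting the result into the Database Connection URL property.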
Created on 03-09-2021 08:43 PM
Tested. Works. Awesome.
Created on 08-22-2022 09:53 AM - edited 08-22-2022 09:55 AM
Does anyone know if there is a way to use Impala over TLS that doesn't require putting passwords in cleartext? (CFM 2.0.4)
Created on 07-04-2024 12:01 AM
Can I use multiple Impala daemons in the connection string, or is there any other way to use multiple Impala daemons? (CDP 7.1.7)