Introduction

Before the official release of Cloudera Data Flow (formerly Hortonworks Data Flow), you may want to experiment with NiFi and Hive on CDH.

However, because CDH 5 uses a fork of Hive 1.1, the HiveQL processors and controller services included in the official Apache NiFi release will not work, so you need to build your own, as explained in this article: Connecting NiFi to CDH Hive.

That article is excellent, but it does not cover Kerberos or SSL; since I had to work out the configuration myself, I thought I would share what I learned.

Note: You could use a DBCP connection pool to connect to Cloudera Hive, but it will not let you use the proper authentication.

Prerequisites

To connect to Hive with SSL and Kerberos, you will need the following:

  • A running NiFi instance (I used Apache NiFi 1.9 in this example)
  • A Kerberized CDH cluster with Hive over SSL (I used CDH 5.15 in this example)
  • The certificate to add to your truststore for the SSL connection
  • A keytab for a user authorized on the cluster
  • The krb5 configuration file from the cluster
  • hive-site.xml, core-site.xml and hdfs-site.xml from your cluster
  • NiFi processors and services compiled for Hive 1.1 on CDH (they can be compiled as described in the article linked above)

Step 1: Add certificate to Java truststore

The goal of this step is to add your certificate to the cacerts truststore of the Java installation that runs NiFi.

To import your certificate, run the following command:

keytool -importcert -alias HS2server -keystore [LOCATION_OF_CACERTS] -file [LOCATION_OF_YOUR_CERTIFICATE]

I'm running on macOS, where my cacerts is at /Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk/Contents/Home/jre/lib/security/cacerts, so I ran:

keytool -importcert -alias HS2server -keystore /Library/Java/JavaVirtualMachines/jdk1.8.0_181.jdk/Contents/Home/jre/lib/security/cacerts -file /Users/pvidal/Documents/customers/quest/config/tls/rootCA.pem
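After the import, it can be worth confirming that the alias really landed in the truststore. The sketch below wraps keytool -list in a small helper. The function name truststore_has_alias is hypothetical, and the default store password changeit is the JDK's stock value, which your installation may have changed.

```shell
# Hypothetical helper (not part of the original article): returns 0 when
# ALIAS exists in the truststore at KEYSTORE, non-zero otherwise.
# Usage: truststore_has_alias KEYSTORE ALIAS [STOREPASS]
truststore_has_alias() {
  local keystore alias_name storepass
  keystore="$1"
  alias_name="$2"
  storepass="${3:-changeit}"   # assumption: JDK default truststore password

  # Fail early with a clear message if the truststore path is wrong.
  if [ ! -f "$keystore" ]; then
    echo "truststore not found: $keystore" >&2
    return 1
  fi

  # keytool -list exits non-zero when the alias is absent.
  keytool -list -alias "$alias_name" -keystore "$keystore" \
    -storepass "$storepass" >/dev/null 2>&1
}
```

Calling truststore_has_alias [LOCATION_OF_CACERTS] HS2server && echo "certificate imported" should print the message only if Step 1 succeeded.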

Step 2: Prepare NiFi

Note: This step requires a NiFi restart, so I suggest stopping NiFi before following these instructions and starting it again afterwards.

Add the krb5 conf file to the NiFi properties

Go to your NiFi conf folder and edit the nifi.properties file to set the following:

nifi.kerberos.krb5.file=[LOCATION_OF_YOUR_KRB5.CONF]
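If you prefer to script this change rather than edit the file by hand, here is a minimal sketch. The helper name set_property is hypothetical; it drops any existing line for the key (the file may already contain the key with an empty value) and appends the new value at the end of the file.

```shell
# Hypothetical helper (not part of the original article): set KEY=VALUE in
# a Java-style properties file, replacing any existing line for KEY.
# Usage: set_property FILE KEY VALUE
set_property() {
  local file key value tmp
  file="$1"
  key="$2"
  value="$3"

  [ -f "$file" ] || { echo "file not found: $file" >&2; return 1; }
  tmp="$(mktemp)" || return 1

  # Keep every line that does not start with "KEY=" (literal match, no regex).
  awk -v k="${key}=" 'index($0, k) != 1' "$file" > "$tmp"
  # Append the new value; note this moves the property to the end of the file.
  printf '%s=%s\n' "$key" "$value" >> "$tmp"

  # Write back through redirection so the original file permissions are kept.
  cat "$tmp" > "$file"
  rm -f "$tmp"
}
```

For example: set_property [NIFI_HOME]/conf/nifi.properties nifi.kerberos.krb5.file [LOCATION_OF_YOUR_KRB5.CONF], where [NIFI_HOME] stands for your NiFi installation folder.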

Load the processors and services into NiFi

Go to your NiFi lib folder and add the necessary NARs; I added the following:

-rwxr-xr-x@ 1 pvidal admin 14800 Mar 5 16:39 nifi-hive-services-api-nar-1.9.0.1.0.0.0-49.nar
-rwxr-xr-x@ 1 pvidal admin 164674666 Mar 5 16:39 nifi-hive_1_1-nar-1.9.0.1.0.0.0-49.nar
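A quick way to check that both NARs are in place before restarting is sketched below. The helper name missing_nars is hypothetical, and it matches only on the two file-name prefixes shown in the listing above, whatever the version suffix is.

```shell
# Hypothetical helper (not part of the original article): print the name
# prefix of each required Hive 1.1 NAR that is absent from LIB_DIR.
# Prints nothing when both NARs are present.
# Usage: missing_nars LIB_DIR
missing_nars() {
  local lib_dir prefix
  lib_dir="$1"
  for prefix in nifi-hive-services-api-nar nifi-hive_1_1-nar; do
    # ls fails when the glob matches no file, whatever the version suffix.
    if ! ls "$lib_dir"/"$prefix"-*.nar >/dev/null 2>&1; then
      echo "$prefix"
    fi
  done
}
```

Running missing_nars [NIFI_HOME]/lib, with [NIFI_HOME] standing for your NiFi installation folder, should print nothing once both files have been copied.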

Step 3: Configure NiFi

Note: Remember to restart NiFi before this step.

Configure a KeytabCredentialsService

Go to your controller services, and add a new KeytabCredentialsService.

Configure the service as follows:

  • Kerberos Keytab: [LOCATION_OF_YOUR_KEYTAB]
  • Kerberos Principal: [NAME_OF_YOUR_PRINCIPAL]

Enable the service.
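If the service fails to enable, one thing to rule out is a keytab that does not contain the principal you entered. The sketch below is a read-only check that assumes the MIT Kerberos client tools are installed (klist -kt lists the entries of a keytab); the helper name keytab_has_principal is hypothetical.

```shell
# Hypothetical helper (not part of the original article): returns 0 when
# PRINCIPAL appears among the entries of KEYTAB, non-zero otherwise.
# Assumption: MIT Kerberos klist is on the PATH.
# Usage: keytab_has_principal KEYTAB PRINCIPAL
keytab_has_principal() {
  local keytab principal
  keytab="$1"
  principal="$2"

  # Fail early if the keytab path is wrong or unreadable by this user.
  if [ ! -r "$keytab" ]; then
    echo "keytab not readable: $keytab" >&2
    return 1
  fi

  # List the keytab entries and look for the principal as a literal string.
  klist -kt "$keytab" 2>/dev/null | grep -qF "$principal"
}
```

For example: keytab_has_principal [LOCATION_OF_YOUR_KEYTAB] [NAME_OF_YOUR_PRINCIPAL] && echo "keytab OK". The check does not request a ticket, so it leaves your credential cache untouched.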

Configure a Hive_1_1ConnectionPool

Go to your controller services, and add a new Hive_1_1ConnectionPool (from the NAR you imported).

Configure the service as follows:

  • Database Connection URL: jdbc:hive2://[YOUR_HIVE_HOST]:10000/default;principal=hive/_HOST@[YOUR_REALM];ssl=true (where [YOUR_REALM] is the same Kerberos realm as your principal)
  • Hive Configuration Resources: [LOCATION_OF_HIVE_SITE.XML],[LOCATION_OF_CORE_SITE.XML],[LOCATION_OF_HDFS_SITE.XML]
  • Kerberos Credentials Service: [YOUR_KEYTABCREDENTIALSSERVICE]

Enable the service.
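The Database Connection URL is easy to mistype, so here is a minimal sketch that assembles it from its parts. The helper name hive_jdbc_url is hypothetical; the port 10000 and the database default are simply the values used in this article.

```shell
# Hypothetical helper (not part of the original article): build the
# HiveServer2 JDBC URL for a Kerberized, SSL-enabled cluster.
# Usage: hive_jdbc_url HOST REALM [PORT] [DATABASE]
hive_jdbc_url() {
  local host realm port db
  host="$1"
  realm="$2"
  port="${3:-10000}"    # port used in this article
  db="${4:-default}"    # database used in this article

  # hive/_HOST is kept literally, exactly as in the URL given above.
  printf 'jdbc:hive2://%s:%s/%s;principal=hive/_HOST@%s;ssl=true\n' \
    "$host" "$port" "$db" "$realm"
}
```

For example, hive_jdbc_url hs2.example.com EXAMPLE.COM prints jdbc:hive2://hs2.example.com:10000/default;principal=hive/_HOST@EXAMPLE.COM;ssl=true (the host and realm here are placeholders).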

Configure a simple flow

I built a simple flow that contains only:

  • A SelectHive_1_1QL processor (from the NAR I imported)
  • A ConvertAvroToJSON processor (to make the output readable)
  • A log message processor

The only configuration I had to do was to reference, in the SelectHive_1_1QL processor, the Hive_1_1ConnectionPool I created earlier.

Note: With the official release of CDF, all of this will be MUCH simpler, with no need for NAR import. If you're not excited about it, I am!


Comments

Where can I get both of the imported files?