Cloudera Employee

Overview - ExecuteProcess 

When developing custom logic as part of a dataflow, the ability to execute a custom script is a boon, especially for covering capabilities that NiFi does not provide out of the box or that are too niche to be a priority for the wider audience. NiFi makes this possible through a processor called ExecuteProcess, which runs a command or process on the underlying operating system.

 

Background

While using this processor, I often had trouble managing the different modules required by the different ExecuteProcess processors within my flow. At times I needed different versions of the same module for different sets of scripts. Python virtual environments came to my rescue here. As a best practice, I recommend using a virtual environment to install additional modules whenever you run Python code from the ExecuteProcess processor.

 

Let me now show you how this can be achieved. For the subsequent steps, these are the versions I am using:

 

Python Version : 3.6.8

pip Version : 21.3.1

NiFi Version : 1.18.0.2.2.6.0-260

Python Module : Faker

 

Process

The whole process is divided into two stages:

  1. Changes on the system where NiFi is installed (creation of the virtual env)
  2. Configuring the ExecuteProcess processor

 Stage 1 : Creation of the virtual env

        Step 1 : Execute the following command to create a virtual env

                  virtualenv <env-name-without-brackets>

                  Eg: virtualenv env

          *Executing this creates a directory named after the environment (env in our case) in the current directory.

                 mmehra_0-1674473504888.png
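
If the virtualenv command is not available on the NiFi host, the venv module from the Python standard library is a different tool that produces a similar directory layout. The sketch below is only an illustration under that assumption, not the method used in this article; the function name create_env and the with_pip parameter are introduced here for the example.

```python
import os
import venv


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return its absolute path.

    with_pip=True asks venv to bootstrap pip into the new environment,
    which relies on the ensurepip module being present on the system.
    """
    builder = venv.EnvBuilder(with_pip=with_pip)
    builder.create(env_dir)
    return os.path.abspath(env_dir)


# Hypothetical usage, mirroring "virtualenv env" from Step 1:
# create_env("env")
```

The resulting environment is activated the same way, by sourcing the activate file inside its bin directory.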

 

       Step 2 : Activate the environment using the following command

                    source env/bin/activate

               Note : Once the virtual environment is activated, you should see your environment name at the start of the command prompt.

mmehra_1-1674473592173.png

          Step 3 : With the environment still active, install the module using the following command

                           pip install faker

          Step 4 : Once the installation completes, create a shell script (test.sh) with the following content

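                                  #!/bin/bash

                                  # Activate the virtual environment so that "python" resolves to the environment's interpreter
                                  source /path/to/your-environment/env/bin/activate

                                  # Run the Python script with the modules installed in the environment
                                  python /path/to/your/script/test.py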

            My script folder looks something like this:

                   mmehra_2-1674474210257.png
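
The contents of test.py are not shown in this article, so the following is only a minimal sketch of what such a script could look like, assuming it simply prints one generated name. The helper in_virtualenv is a hypothetical addition that makes it easier to see whether the wrapper script really activated the environment.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases make sys.prefix differ from sys.base_prefix instead.
    if hasattr(sys, "real_prefix"):
        return True
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


def main():
    if not in_virtualenv():
        sys.stderr.write("warning: not running inside a virtual environment\n")
    try:
        # Faker is the module installed into the environment in Step 3.
        from faker import Faker
    except ImportError:
        sys.stderr.write("error: faker is not importable from {}\n".format(sys.prefix))
        return 1
    # ExecuteProcess writes the command's output to a FlowFile,
    # so whatever is printed here becomes the FlowFile content.
    print(Faker().name())
    return 0


if __name__ == "__main__":
    main()
```

Running test.sh by hand should print a random name. If the error message appears instead, the prefix it reports shows which interpreter was actually used.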

 

Important : Make sure the script and the virtual environment folder are accessible to the user that runs NiFi.
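
One way to verify this is a small check like the one below. It is a sketch under two assumptions: that it is run as the same operating system user that runs NiFi (os.access only reports on the user running the check), and that the environment has the bin/activate layout shown above. The function name check_access is made up for this example, and the executable-bit check only matters if the processor calls the script directly rather than through bash.

```python
import os


def check_access(script_path, env_dir):
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    activate = os.path.join(env_dir, "bin", "activate")
    if not os.access(script_path, os.R_OK):
        problems.append("cannot read script: " + script_path)
    elif not os.access(script_path, os.X_OK):
        problems.append("script is not executable: " + script_path)
    if not os.access(activate, os.R_OK):
        problems.append("cannot read activate file: " + activate)
    return problems


# Hypothetical usage with the placeholder paths from Step 4:
# for problem in check_access("/path/to/your/script/test.sh",
#                             "/path/to/your-environment/env"):
#     print(problem)
```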

 

Stage 2 : Configuring the ExecuteProcess processor

            Step 1 : In NiFi, configure the processor in the following way

mmehra_3-1674474345738.png

Your processor will now be able to import the module that you have set up within your Python virtual environment.
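
If the processor still reports an import error, it can help to reproduce the call outside NiFi first. The sketch below runs a command without a shell and captures its output. It only approximates what the processor does and is not NiFi's implementation; the function name run_command and the 60-second timeout are assumptions made for this example.

```python
import subprocess


def run_command(argv, timeout=60):
    """Run argv without a shell and return (exit_code, stdout, stderr) as text."""
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text; this spelling works on Python 3.6
        timeout=timeout,
    )
    return result.returncode, result.stdout, result.stderr


# Hypothetical usage with the placeholder path from Step 4:
# code, out, err = run_command(["/bin/bash", "/path/to/your/script/test.sh"])
# print(code, out, err)
```

A non-zero exit code or a traceback in the captured error output usually points at the wrapper script or the environment rather than at the processor configuration.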

 

 

Comments

Very good article, this will definitely help me in the future!