Created on 01-23-2023 03:51 AM - edited 01-23-2023 03:52 AM
Overview - ExecuteProcess
When developing custom logic as part of a dataflow, the ability to execute a custom script is a boon, especially for compensating for capabilities that are not available out of the box or that are not yet a priority for the wider audience. NiFi makes this possible through a processor called ExecuteProcess, which lets you execute custom code. It relies on the underlying operating system to run the command or process.
Background
While using this processor, I often had trouble managing the different modules required by different ExecuteProcess processors within my flow. At times I needed different versions of the same module for different sets of scripts. Python virtual environments came to my rescue here. As a best practice, it is therefore advisable to use a virtual environment to set up additional modules when running Python code from the ExecuteProcess processor.
Let me now show you how this can be achieved. For the subsequent steps, these are the versions I am using:
Python Version : 3.6.8
pip Version : 21.3.1
NiFi Version : 1.18.0.2.2.6.0-260
Python Module : Faker
Process
The whole process is divided into two stages:
Changes on the system where NiFi is installed (creation of the virtual env)
Configuring the ExecuteProcess processor
Stage 1 : Creation of the virtual env
Step 1 : Execute the following command to create a virtual env
virtualenv <env-name-without-brackets>
For example: virtualenv env
Note : Running this command creates a directory named after the environment (env in our case) in the current directory.
Step 2 : Activate the environment using the following command
source env/bin/activate
Note : Once the virtual environment is activated, you should see your environment name at the start of the command prompt.
Step 3 : Install the module using the following command
pip install faker
Step 4 : Once the installation completes, create a shell script (test.sh) with the following content
#!/bin/bash
source /path/to/your-environment/env/bin/activate
python /path/to/your/script/test.py
My script folder looks something like this:
Important : Make sure the script and the virtual environment folder are accessible to the user that runs NiFi.
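To check that the wrapper script really runs Python from the virtual environment, a small diagnostic along the following lines could be placed in test.py (or run in its place) before wiring anything into NiFi. This is only a minimal sketch under stated assumptions: the helper names in_virtualenv and module_available are hypothetical, and the code uses nothing beyond the standard library.

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a virtual environment."""
    # venv and recent virtualenv releases expose the system prefix as
    # sys.base_prefix; older virtualenv releases set sys.real_prefix instead.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


def module_available(name):
    """Return True when a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # ExecuteProcess captures standard output, so these lines would end up
    # in the content of the FlowFile produced by the processor.
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtualenv())
    print("faker importable:", module_available("faker"))
```

When run through test.sh, the interpreter path would be expected to point inside the virtual environment directory; if it points at the system Python instead, the source line in the wrapper is the first thing to check.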
Stage 2 : Configuring the ExecuteProcess processor
Step 1 : In NiFi, configure the processor in the following way
Your processor will now be able to import the module that you have set up within your Python virtual environment.
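For illustration, here is what a test.py that uses the Faker module might look like. This is a hypothetical sketch, not the script used in this article: the build_record helper and the choice of fields are assumptions, while Faker(), name() and email() are part of the Faker package. The import sits inside main() so that a missing module produces a readable message instead of a traceback.

```python
import json
import sys


def build_record(fake):
    """Build one fake record from a Faker instance (or any object with the same methods)."""
    return {"name": fake.name(), "email": fake.email()}


def main():
    try:
        # This import resolves only when the interpreter comes from the
        # virtual environment in which `pip install faker` was run.
        from faker import Faker
    except ImportError:
        print("faker is not importable; is the virtual environment active?", file=sys.stderr)
        return 1
    # ExecuteProcess writes the command's standard output to a FlowFile,
    # so the JSON line printed here becomes the FlowFile content.
    print(json.dumps(build_record(Faker())))
    return 0


if __name__ == "__main__":
    main()
```

Keeping the record-building logic in its own function means it can be exercised with a stand-in object, without Faker or NiFi being available.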