how to install python libs in the python engine of the "ExecuteScript" processor?

Expert Contributor

Hi Cloudera community,

I am using the "ExecuteScript" processor to run Python code with the "python" engine (Python 2.7, Jython 2.7.3).

When I run the code, I get the error below:

"org.apache.nifi.processor.exception.ProcessException: javax.script.ScriptException: ImportError: No module named <name_module> in <script> at line number 3
- Caused by: javax.script.ScriptException: ImportError: No module named <name_module> in <script> at line number 3
- Caused by: Traceback (most recent call last):
File "<script>", line 3, in <module>
ImportError: No module named <name_module>"

 

I would like to know how to install libraries in the Python environment that this processor uses.


Super Guru

Hi,

Have you looked into the Module Directory property? Honestly, I have not tried it myself, but see if you can save your packages to a location that is visible to NiFi. If you have a cluster, make sure this path and the required modules are accessible on all nodes.
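As a rough illustration of the idea, the script itself can also extend its import search path to a directory that holds the packages. This is only a sketch under assumptions: the directory /opt/nifi/python-libs is hypothetical, and it assumes the packages were placed there beforehand (for example with pip's --target option run from a regular Python installation). Note that Jython can only import pure-Python packages; libraries that rely on CPython native extensions will still fail to import.

```python
import sys

# Hypothetical directory, not from the original post. It would have to exist
# on every NiFi node and contain the pure-Python packages the script needs.
MODULE_DIR = "/opt/nifi/python-libs"


def add_module_dir(path, search_path=None):
    """Append path to the import search path once.

    Returns True if the path was added, False if it was already present.
    """
    if search_path is None:
        search_path = sys.path
    if path in search_path:
        return False
    search_path.append(path)
    return True


# Run this before the import statement that currently raises ImportError.
add_module_dir(MODULE_DIR)
```

Setting the Module Directory property to the same directory is meant to achieve the same effect without touching the script.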

I would also recommend upgrading, if you can, to the latest NiFi 2.0 release. Python is more tightly integrated into the NiFi framework there: you can design custom processors easily, and NiFi takes care of downloading all required packages for you. To find out more about using Python extensions with NiFi 2.0, refer to the following:

https://nifi.apache.org/documentation/nifi-2.0.0-M2/html/python-developer-guide.html

https://www.youtube.com/watch?v=9Oi_6nFmbPg&t=580s

Also keep in mind that support for Python/Jython scripts in the ExecuteScript processor is deprecated from 2.0 onward. So my recommendation is either to upgrade and take advantage of the Python extensions, or to use Groovy instead, since it is still supported. That way you avoid the headache of rewriting all your Python scripts when you upgrade to version 2.0 or higher.

If this helps please accept solution.

Thanks

 

Expert Contributor

Hi @SAMSAL,

Since we need to use many libraries in Python, we have to install them with pip install before running the script.

So we would like to know how to run this "pip install" of the Python libraries so that the Python engine of this processor can use them correctly.

Super Guru

If you are talking about the Python processor engine in 2.0, it uses the virtual environment library (venv), which must be available in the Python installation under your PYTHON_HOME path before you enable the Python extension in the nifi.properties file (see the referenced guide).
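One way to check this prerequisite is to ask the interpreter itself whether the modules can be found. The following is a minimal sketch, assuming it is run with the same Python interpreter that NiFi is configured to use; the helper name missing_modules is made up for illustration, and ensurepip is included on the assumption that pip has to be bootstrapped into the new environment.

```python
import importlib.util


def missing_modules(names):
    """Return the module names from `names` that this interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # venv creates the isolated environment; ensurepip installs pip into it.
    missing = missing_modules(["venv", "ensurepip"])
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("venv and ensurepip are available")
```

On some Linux distributions venv ships as a separate system package, so an empty result here is worth confirming before restarting NiFi.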

When you deploy your Python processor under {NIFI_PATH}/python/extensions and restart NiFi, the framework processes the file and moves it to its own directory under {NIFI_PATH}/work/python. It then creates a Python virtual environment (venv) that is isolated to that processor and has its own Python commands, including pip. Those commands are placed in the bin folder (Scripts on Windows) under the processor folder and are used to download the dependency packages into that environment.

As for declaring the dependency packages: if you follow any existing Python processor template, you will find them defined in the dependencies attribute of the ProcessorDetails class, as follows:

 

class ProcessorDetails:
    version = "2.0.0-SNAPSHOT"
    description = """ Some Description"""
    tags = ["excel", "json", "convert"]
    dependencies = ["pandas", "numpy", "openpyxl", "xlrd"]
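For context, here is a sketch of where that class sits inside a complete processor file, loosely following the FlowFileTransform pattern from the Python developer guide linked earlier. The processor name UppercaseContent and its behavior are invented for illustration, and the fallback classes in the except branch are stand-ins (not part of NiFi) that only exist so the file can be exercised outside a NiFi installation.

```python
try:
    from nifiapi.flowfiletransform import FlowFileTransform, FlowFileTransformResult
except ImportError:
    # Stand-ins for trying the sketch outside NiFi; they are not NiFi classes.
    class FlowFileTransform(object):
        pass

    class FlowFileTransformResult(object):
        def __init__(self, relationship, contents=None, attributes=None):
            self.relationship = relationship
            self.contents = contents
            self.attributes = attributes


class UppercaseContent(FlowFileTransform):  # hypothetical processor name
    class Java:
        implements = ["org.apache.nifi.python.processor.FlowFileTransform"]

    class ProcessorDetails:
        version = "2.0.0-SNAPSHOT"
        description = "Upper-cases the FlowFile content (illustrative only)."
        tags = ["example"]
        # Third-party packages would be listed here, e.g. ["pandas", "numpy"].
        dependencies = []

    def __init__(self, **kwargs):
        pass

    def transform(self, context, flowfile):
        # Read the incoming content, transform it, and route it to "success".
        text = flowfile.getContentsAsBytes().decode("utf-8")
        return FlowFileTransformResult(relationship="success", contents=text.upper())
```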

 

Once the venv is created, the Python engine starts downloading those packages automatically and no intervention is required. You can track the download progress in the app-nifi.log file. You can tell that the dependency download is complete either by checking the processor itself on the NiFi canvas, where all defined properties and relationships should be listed, or by looking for the file "dependency-download.complete" under the processor folder where the packages are installed. If the file contains the value True, the dependencies have been downloaded successfully and the processor is ready to use.
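If you want to script that last check, something along these lines could work. It is only a sketch based on the description above: the helper name dependencies_ready is made up, the processor folder has to be supplied by you, and the comparison against "true" assumes the marker file holds nothing but that value.

```python
import os

MARKER_NAME = "dependency-download.complete"


def dependencies_ready(processor_dir):
    """Return True if the marker file exists in processor_dir and contains True."""
    marker = os.path.join(processor_dir, MARKER_NAME)
    if not os.path.isfile(marker):
        return False
    with open(marker) as handle:
        return handle.read().strip().lower() == "true"
```

Call it with the processor's folder under {NIFI_PATH}/work/python; a False result means the download is either still running or has failed, so app-nifi.log is the next place to look.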

Hope that helps.

 

 

Expert Contributor

Hi @SAMSAL,

I will test this process and report back with updates.