How to add additional libraries in ExecuteScript Processor for Python

New Contributor

Greetings Community,

I am working on a Python script that uses an HTTP POST request to get data.

I first used the InvokeHTTP processor and worked hard to get the data with it, but later found out that the API developer provides its own authentication libraries. So I need to execute the specific Python code provided by the API developer to get the data.

Now I am trying to run the Python code with some additional libraries.

I referred to the link below, where additional libraries for Groovy are placed in a local directory and the path is set in the Module Directory property of the ExecuteScript processor.

https://community.hortonworks.com/questions/47493/nifi-executescript-using-external-libarries-with-g...

I am doing the same by specifying the locations of the Python libraries (folder locations) as a comma-separated list.

I tried both forward slashes and backslashes when specifying the folder locations, and even tried pointing at each package's main Python file (__init__.py), but every attempt fails with the following error:

No Module named xxx NOT FOUND in line XXX

Here is what I entered in Module Directory in the ExecuteScript processor:

C:\Users\pjalla\AppData\Local\Programs\Python\Python35-32\Lib\site-packages\requests\__init__.py,C:\Users\pjalla\AppData\Local\Programs\Python\Python35-32\Lib\json\__init__.py,C:\Users\pjalla\AppData\Local\Programs\Python\Python35-32\Lib\site-packages\requests_oauthlib\oauth1_auth.py,C:\Users\pjalla\AppData\Local\Programs\Python\Python35-32\Lib\site-packages\requests_oauthlib\oauth1_session.py

Accepted Solutions

Re: How to add additional libraries in ExecuteScript Processor for Python

When using the Jython script engine in ExecuteScript, the Module Directory property works somewhat like the standard Python module system. I believe you'll want to specify the location(s) as the directories containing the scripts, not the scripts themselves. Each entry in Module Directory is added to the script engine via a "sys.path.append()" call, so if you can get the modules to load in a regular Python script (using sys.path.append() with full path names), then you can use that same list for Module Directory. I'd try:

C:\Users\pjalla\AppData\Local\Programs\Python\Python35-32\Lib\site-packages\requests,C:\Users\pjalla\AppData\Local\Programs\Python\Python35-32\Lib\json,C:\Users\pjalla\AppData\Local\Programs\Python\Python35-32\Lib\site-packages\requests_oauthlib

or whatever variation of these paths is needed to avoid problems with the drive letter, backslashes, etc.

In this unit test you can see the Module Directory set to a test resources folder, the test uses "from callbacks import ReadFirstLine", and there are .py files in the "callbacks" folder under the test resources folder specified as Module Directory. Hopefully you can use a similar approach to have the script engine find the folders/locations and import the corresponding code.
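To try the sys.path.append() approach outside NiFi first, here is a minimal sketch. It builds a throwaway package in a temp folder (the name "demo_callbacks" and its function are hypothetical, made up for illustration) and imports it after appending a path. Note that for a package (a folder with __init__.py), Python looks for the package name inside each sys.path entry, so the entry has to be the folder that contains the package folder, as in the unit test layout described above.

```python
import os
import sys
import tempfile

# Throwaway layout: <module_root>/demo_callbacks/__init__.py
# ("demo_callbacks" is a hypothetical package name used only for this sketch.)
module_root = tempfile.mkdtemp()
package_dir = os.path.join(module_root, "demo_callbacks")
os.mkdir(package_dir)
with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
    handle.write(
        "def read_first_line(text):\n"
        "    return text.splitlines()[0] if text else ''\n"
    )

# Append the PARENT of the package folder, not the package folder itself
# and not the __init__.py file.
sys.path.append(module_root)

from demo_callbacks import read_first_line

print(read_first_line("first\nsecond"))  # prints: first
```

If the import works in a plain script like this, the same folder (here the value of module_root) would be the candidate to put in Module Directory.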

Re: How to add additional libraries in ExecuteScript Processor for Python

New Contributor

Thanks for the reply, Matt. I read it carefully, but I can only find a "python" script engine, not a Jython one, in both ExecuteScript and InvokeScriptedProcessor.

Re: How to add additional libraries in ExecuteScript Processor for Python

ExecuteScript uses the name provided by the script engine to display to the user. The Jython script engine chose "python" as the language name, so that is the correct choice. One major difference from standard (C)Python is that with Jython you can only include pure Python (.py) modules, not natively compiled modules like numpy/scipy.
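As a rough way to check whether a library folder is pure Python before pointing Module Directory at it, something like the sketch below could be run with a regular Python interpreter. The helper name find_native_files and the suffix list are my own assumptions, and this is only a heuristic: a package with no compiled files can still depend on another package that has them.

```python
import os

# File suffixes that usually indicate natively compiled code (assumed list,
# not exhaustive; versioned names such as "libfoo.so.1" are not matched).
NATIVE_SUFFIXES = (".pyd", ".so", ".dll", ".dylib")


def find_native_files(package_dir):
    """Return sorted relative paths of compiled files found under package_dir."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(package_dir):
        for name in filenames:
            if name.lower().endswith(NATIVE_SUFFIXES):
                full_path = os.path.join(dirpath, name)
                hits.append(os.path.relpath(full_path, package_dir))
    return sorted(hits)
```

An empty list suggests the folder itself holds only .py sources; any hit means the Jython engine will most likely not be able to load that module.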

Re: How to add additional libraries in ExecuteScript Processor for Python

New Contributor

@Matt Burgess

Is there a way of either adding non-pure Python modules or otherwise forcing the script to run under a predefined environment (e.g. a local Anaconda environment)?

Re: How to add additional libraries in ExecuteScript Processor for Python

No, you'd have to use ExecuteStreamCommand or ExecuteProcess for things like Anaconda environments, non-pure (CPython) modules, etc.
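For the ExecuteStreamCommand route, a minimal sketch of the kind of standalone script it could call is below. It assumes the processor is configured to run the interpreter from the desired environment with this script as an argument, and that the flow file content arrives on stdin and whatever is written to stdout becomes the output content. The transform function is a placeholder I made up; the real work (e.g. calls into non-pure modules) would go there.

```python
import sys


def transform(text):
    """Placeholder transformation: upper-case the incoming content."""
    return text.upper()


def main(stdin=sys.stdin, stdout=sys.stdout):
    # Read the whole incoming content, transform it, and write the result.
    stdout.write(transform(stdin.read()))


# Only run automatically when content is actually piped in, so the file can
# also be imported or started from an interactive terminal without blocking.
if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

Because the script runs in a separate process under a regular interpreter, it is not limited by what the Jython engine can import.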