Listing S3 Bucket files using python in NiFi ExecuteScript Processor

Rising Star

Hi, since the ListS3 processor accepts only one prefix, I am trying to access AWS S3 with Python boto3 from the NiFi ExecuteScript processor. If this works, I can pass a list of folder paths to the Python script and get files from several folders in the S3 bucket. I set Module Directory to the path of boto3-1.6.0.tar.gz, but I get the error below.

@Matt Burgess @Bryan Bende @Matt Foley Please let me know how to fix this and achieve this use case.

ExecuteScript processor:

[Screenshot: 64411-executescript-python-s3.jpg]

Error:

ExecuteScript[id=d6cf51e8-0161-1000-32de-7748af781842] Failed to process session due to org.apache.nifi.processor.exception.ProcessException: javax.script.ScriptException: ImportError: No module named boto3 in <script> at line number 1: javax.script.ScriptException: ImportError: No module named boto3 in <script> at line number 1
3 REPLIES

Explorer

Set your Module Directory to something like this to pick up all the Python modules: /usr/local/lib/python2.7/site-packages,/usr/lib/python2.7/dist-packages

New Contributor

Has anyone validated if you can access boto3 from Apache NiFi?

Expert Contributor

https://lists.apache.org/thread/907n11xlvdmckp1045bspzloclthfqsh

 

As NiFi is a pure Java/JVM application, we use Jython rather than Python for ExecuteScript. This means that you can't import native (CPython, e.g.) modules into your Jython scripts in ExecuteScript. Consider using ExecuteStreamCommand with a real Python interpreter and script. I'm looking at Py4J to try and bridge the gap
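
For anyone trying the ExecuteStreamCommand route suggested in the quote, here is a minimal sketch of what the external script might look like. It is not a tested NiFi flow. It assumes a real CPython interpreter with boto3 installed on the NiFi host, AWS credentials available through boto3's default credential lookup, the bucket name passed as the first command argument, and the incoming flow file content holding one prefix per line (ExecuteStreamCommand feeds flow file content to the command's stdin). The names `list_keys` and `main` are made up for illustration.

```python
#!/usr/bin/env python3
"""List S3 object keys under several prefixes (run by CPython, not Jython)."""
import sys


def list_keys(client, bucket, prefixes):
    """Yield every object key in `bucket` found under each of `prefixes`."""
    # list_objects_v2 returns results in pages, so use a paginator.
    paginator = client.get_paginator("list_objects_v2")
    for prefix in prefixes:
        for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
            # "Contents" is missing from a page when nothing matches.
            for obj in page.get("Contents", []):
                yield obj["Key"]


def main(argv, stdin, stdout):
    # Imported here so that only a real run needs boto3 installed.
    import boto3

    bucket = argv[1]
    # Assumption: one prefix per line on stdin (the flow file content).
    prefixes = [line.strip() for line in stdin if line.strip()]
    for key in list_keys(boto3.client("s3"), bucket, prefixes):
        stdout.write(key + "\n")


# Only run when invoked as a script with exactly one argument (the bucket).
if __name__ == "__main__" and len(sys.argv) == 2:
    main(sys.argv, sys.stdin, sys.stdout)
```

In ExecuteStreamCommand you would point Command Path at the Python interpreter and pass the script path and bucket name as the command arguments. The keys written to stdout become the content of the outgoing flow file, which you could then split and route to FetchS3Object.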