Python Script using ExecuteStreamCommand

Hello Hortonworks!

I searched for previous questions and examples relevant to this one but could not find the answers I'm looking for, so I'm submitting a question myself.

ExecuteStreamCommand seems like the perfect processor for me due to the following reasons:

  • I can execute any Python script and avoid Jython (similar to ExecuteProcess). Jython is not an option for me.
  • I can take in FlowFiles. This is necessary because my script is meant to consume the output of a previous processor. Furthermore, I like the idea of keeping the data under "NiFi management".
  • It writes an "execution status" which will be useful for routing.

In a nutshell, what I'm trying to do with ExecuteStreamCommand is:

  1. Ingest the output of a previous processor (to be exact, a Scrapy spider that outputs a text file of JSON lines).
  2. Call a Python script (`python3`).
  3. Load the ingested FlowFile in my Python script.
  4. Select the content of the FlowFile.
  5. Operate on the content of the FlowFile within Python.
  6. Output either an updated version of the original FlowFile or create a new one.
  7. Continue with my NiFi flow with the updated/new FlowFile.
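
For reference, steps 3 to 6 can be sketched as a plain stdin-to-stdout script. This assumes ExecuteStreamCommand passes the incoming FlowFile's content to the command on stdin and uses whatever the command writes to stdout as the content of the outgoing FlowFile, and that the script's exit code ends up in the "execution status" mentioned above. The `transform` function and the `processed` field are hypothetical placeholders, not part of NiFi or of my actual script.

```python
#!/usr/bin/env python3
"""Sketch: read JSON lines from stdin, transform them, write them to stdout."""
import json
import sys


def transform(record):
    # Hypothetical per-record operation; replace with the real logic.
    record["processed"] = True
    return record


def process(in_stream, out_stream):
    """Copy JSON lines from in_stream to out_stream, transforming each record.

    Returns the number of records written.
    """
    count = 0
    for line in in_stream:
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        record = transform(json.loads(line))
        out_stream.write(json.dumps(record) + "\n")
        count += 1
    return count


def main():
    try:
        process(sys.stdin, sys.stdout)
    except json.JSONDecodeError as exc:
        # Diagnostics go to stderr so they do not mix with the output content.
        print(f"bad JSON line: {exc}", file=sys.stderr)
        return 1
    return 0


if __name__ == "__main__":
    exit_code = main()
    if exit_code:
        # A non-zero exit code signals failure to the caller.
        sys.exit(exit_code)
```

Under the same assumptions, the processor's Command Path would be `python3` and Command Arguments would be the path to the script. The script can be checked outside NiFi first by piping a sample file into it, for example `python3 transform.py < sample.jl` (both file names are made up for illustration).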

I currently don't understand:

  • How to call the Python script (from the ExecuteStreamCommand processor)
  • How to load up the FlowFile from within Python
  • How to update or create a new FlowFile
  • How to output the updated FlowFile from Python back to NiFi.

I have come across various examples for ExecuteScript, but unfortunately they don't translate directly to ExecuteStreamCommand.

Thank you in advance. Any advice is appreciated.
