How to generate multiple flowfiles for each stdout in ExecuteProcess processor?

New Contributor

Hi,

I am using a custom Python script (image attached below) in the ExecuteProcess processor, and it generates flowfiles from the script's stdout. The problem is that everything written to stdout ends up in a single flowfile. My requirement is to generate a separate flowfile for each stdout output.

I tried putting a delimiter pattern in the stdout and feeding the output flowfile to SplitContent. But I keep getting "A flowfile is currently penalized and the data cannot be processed at this time", and it doesn't resolve even after waiting a long time. SplitContent doesn't show any error either; it just doesn't perform the split.

I am trying to split the flowfile on the pattern "#;#" (please see the attached images below).

I would really appreciate any help! 

PS: I also tried using ExecuteScript to run the custom Python script, but I guess since it runs on Jython, it didn't allow me to import some Python packages.

image.png

abhishek07_0-1657346175641.png

abhishek07_1-1657346205758.png

 

1 ACCEPTED SOLUTION

Super Guru

Hi,

 

I tried the same scenario using SplitContent and the code you provided above, and it worked for me: I am getting two JSON records. I'm using version 1.16.0.

 

SAMSAL_4-1657373645593.png

 

 

ExecuteProcess Configuration:

SAMSAL_2-1657373467398.png

SplitContent Configurations:

SAMSAL_3-1657373520343.png

 

New Contributor

Hi @SAMSAL,
First, thank you for your response! I keep getting 'A flowfile is currently penalized and the data cannot be processed at this time' when I hover over the connection between ExecuteProcess and SplitContent, and I guess that's why the split is not working for me.

Any idea what I might be doing wrong, or any setting I need to check? I am using NiFi 1.16.2.

Really appreciate your help!

abhishek07_0-1657387051163.png

 

Super Guru

Hi,

If your configuration is similar to mine, I would suggest testing on version 1.16.0 to see whether this is a bug in version 1.16.2, in which case it should be reported. Another option is to write custom code in the ExecuteScript processor that generates a separate flowfile for each JSON record, since ExecuteProcess doesn't have this option and writes everything in the output stream into one flowfile. You can refer to the following post to help you generate multiple flowfiles:

https://community.cloudera.com/t5/Support-Questions/Split-one-Nifi-flow-file-into-Multiple-flow-file...
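
As a rough illustration of the per-record idea, here is a sketch of only the splitting step in plain Python. It assumes the records are separated by the "#;#" pattern from the question; the function name `split_records` is hypothetical. In an ExecuteScript body, each returned piece would be written to its own flowfile, but that NiFi-specific part is left out here.

```python
def split_records(combined_output, delimiter="#;#"):
    """Split combined stdout text into individual records.

    Assumes records are separated by `delimiter` (the "#;#" pattern from
    the question). Empty pieces, such as the one left after a trailing
    delimiter, are dropped.
    """
    pieces = combined_output.split(delimiter)
    return [piece.strip() for piece in pieces if piece.strip()]


if __name__ == "__main__":
    # Sample data for illustration only.
    sample = '{"id": 1}#;#{"id": 2}#;#'
    for record in split_records(sample):
        print(record)
```

Stripping whitespace around each piece is a design choice for this sketch; drop it if leading or trailing whitespace in a record matters.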

Another option, in case there is a problem with the SplitContent processor, is to write every JSON record on its own line and then split them with the SplitText processor, setting the "Line Split Count" property to 1.
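
For the one-record-per-line approach, the script run by ExecuteProcess could print each JSON record on its own line, so that SplitText with a line count of 1 yields one flowfile per record. This is a minimal sketch: the helper name `emit_records` and the sample records are made up for illustration and are not taken from the original script.

```python
import json
import sys


def emit_records(records, stream=sys.stdout):
    """Write each record as compact JSON on its own line."""
    for record in records:
        # json.dumps without indent escapes newlines inside strings,
        # so each record occupies exactly one output line.
        stream.write(json.dumps(record) + "\n")
    stream.flush()


if __name__ == "__main__":
    # Placeholder data; the real script would produce its own records.
    emit_records([{"id": 1, "status": "ok"}, {"id": 2, "status": "failed"}])
```

Pretty-printed JSON (for example `indent=2`) would spread one record over several lines and break the line-based split, which is why the compact form is used here.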