[Apache NiFi Bug] ExecuteSQL can't perform batch loading with an input file



Test case to reproduce:

1) Create a GenerateFlowFile processor with the following Custom Text:

select 1 from dual
union all
select 2 from dual
union all
select 3 from dual

2) Create an ExecuteSQL processor and set "Output Batch Size" to 1.
3) Start the flow.




3 flowfiles in the "success" connection.

Root Cause:

if (outputBatchSize > 0 && resultSetFlowFiles.size() >= outputBatchSize) {
    session.transfer(resultSetFlowFiles, REL_SUCCESS);
    session.commit(); // fails here: the input FlowFile (fileToProcess) is still unaccounted for
}

The commit method has the following requirement:
Commits the current session ensuring all operations against FlowFiles within this session are atomically persisted. All FlowFiles operated on within this session must be accounted for by transfer or removal or the commit will fail.

At the time of any commit, an input FlowFile (fileToProcess) still exists and has not been accounted for by transfer or removal.

How can this be worked around? We have the same issue with our custom processor.


Super Guru

I'm not aware of any workaround, but I did submit a fix for the Apache Jira you wrote (NIFI-6040); it should be in an upcoming HDF release. For your custom processor, you could make the same changes I did in my Pull Request: remove the incoming flow file before the first commit.
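
To illustrate the idea, here is a minimal sketch in plain Java. It does not use the real NiFi classes: ToySession is a hypothetical stand-in that only models the rule quoted above (commit fails while any FlowFile is unaccounted for), and process, removeInputFirst and the string FlowFile ids are made up for the example. It is not the code from the Pull Request, just the same ordering of calls.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class BatchCommitSketch {

    /** Toy stand-in for a process session; NOT the real NiFi API. */
    static class ToySession {
        private final Set<String> unaccounted = new HashSet<>();
        private int nextId = 0;
        int commits = 0;

        /** Creates a FlowFile (modelled as an id) that must later be transferred or removed. */
        String create() {
            String id = "ff-" + nextId++;
            unaccounted.add(id);
            return id;
        }

        void transfer(List<String> flowFiles) {
            unaccounted.removeAll(flowFiles);
        }

        void remove(String flowFile) {
            unaccounted.remove(flowFile);
        }

        /** Mimics the documented contract: every FlowFile must be accounted for. */
        void commit() {
            if (!unaccounted.isEmpty()) {
                throw new IllegalStateException("Not transferred or removed: " + unaccounted);
            }
            commits++;
        }
    }

    /**
     * Creates one output FlowFile per row and commits whenever the batch is full.
     * With removeInputFirst = true, the input is removed before the first commit.
     * Returns the number of output FlowFiles transferred.
     */
    static int process(ToySession session, String input, int rows,
                       int batchSize, boolean removeInputFirst) {
        List<String> batch = new ArrayList<>();
        int emitted = 0;
        for (int i = 0; i < rows; i++) {
            batch.add(session.create());
            if (batchSize > 0 && batch.size() >= batchSize) {
                session.transfer(batch);
                if (removeInputFirst && input != null) {
                    session.remove(input); // account for the input before committing
                    input = null;
                }
                session.commit();
                emitted += batch.size();
                batch.clear();
            }
        }
        // Transfer any partial batch and make sure the input is accounted for.
        if (!batch.isEmpty()) {
            session.transfer(batch);
            emitted += batch.size();
        }
        if (input != null) {
            session.remove(input);
        }
        session.commit();
        return emitted;
    }

    public static void main(String[] args) {
        ToySession fixed = new ToySession();
        System.out.println("fixed emitted: " + process(fixed, fixed.create(), 3, 1, true));

        ToySession buggy = new ToySession();
        try {
            process(buggy, buggy.create(), 3, 1, false);
        } catch (IllegalStateException e) {
            System.out.println("buggy failed: " + e.getMessage());
        }
    }
}
```

In a real processor the corresponding calls would be the ProcessSession methods transfer, remove and commit, applied to the incoming FlowFile (fileToProcess in ExecuteSQL). Note that removing the input early means it can no longer be routed to a failure relationship if a later batch fails, so that trade-off is worth checking for your processor.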