How to extract text from a multiline flow and create a single property with the flow's entire content?

Rising Star

Hi everybody

As input I have a multiline flow like this one:

27/05/2016 06:28:34,000 ERROR [ACTIVE] ExecuteThread: '6' for queue: 'weblogic.kernel.Default (self-tuning)' Exception lors du traitement Mini site inconnu at at

I then use the ExtractText processor with Enable Multiline Mode=true and a new property grok=^(.*)$

In the output, the ${grok} attribute contains only the first line.

My question: how can I retrieve all the input lines in this attribute?

Thanks for your answer.



Hi @Thierry Vernhet,

To achieve what you are looking for, I believe you must set the property "Enable DOTALL mode" to true.

Below is a template that produces the expected result with the example you gave.


Hope this helps.
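
The difference can be reproduced outside NiFi. The following is a minimal sketch using Python's `re` module rather than the Java regex engine behind ExtractText; it assumes the MULTILINE and DOTALL flags behave the same way for this pattern, and the sample text is a shortened, hypothetical version of the log in the question.

```python
import re

# Hypothetical multiline sample, shortened from the log in the question.
text = (
    "27/05/2016 06:28:34,000 ERROR [ACTIVE] ExecuteThread: '6'\n"
    "Exception lors du traitement\n"
    "Mini site inconnu"
)

pattern = r"^(.*)$"

# MULTILINE only: '.' stops at a newline, so the first match is the first line.
first_line = re.search(pattern, text, re.MULTILINE).group(1)

# MULTILINE + DOTALL: '.' also matches newlines, so the greedy group
# extends to the end of the input.
everything = re.search(pattern, text, re.MULTILINE | re.DOTALL).group(1)

print(first_line)
print(everything == text)
```

Without DOTALL the capture ends at the first line break, which matches the behavior described in the question.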


Rising Star

Hi @Pierre Villard

My input is as shown below

  • i, John, $100
  • ii, Kevin, $150
  • iii, Steve, $200

I used the ExtractText processor with Enable Multiline Mode=true, Enable DOTALL Mode=true, and a new property line=(.*).

After execution, I see the following in the Attributes tab of the provenance event:

  • line i, John, $100 ii, Kevin, $150 iii, Steve, $200
  • line.0 i, John, $100 ii, Kevin, $150 iii, Steve, $200
  • line.1 i, John, $100 ii, Kevin, $150 iii, Steve, $200

Expected output

  • line i, John, $100
  • line.0 ii, Kevin, $150
  • line.1 iii, Steve, $200

Please suggest.
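
One reading of the attributes above is that, with DOTALL enabled, `(.*)` matches the whole input in one go, so the full match and its single capture group hold the same text. The sketch below uses Python's `re` module as a stand-in for the Java regex engine (an assumption); the per-line variant with `^(.+)$` and no DOTALL is only a suggestion for a pattern to try, not a confirmed ExtractText configuration.

```python
import re

text = "i, John, $100\nii, Kevin, $150\niii, Steve, $200"

# DOTALL lets '.' cross newlines, so one match swallows every line.
m = re.search(r"(.*)", text, re.MULTILINE | re.DOTALL)
whole_match, group_1 = m.group(0), m.group(1)

# Without DOTALL, each line is a separate match of ^(.+)$.
lines = re.findall(r"^(.+)$", text, re.MULTILINE)

print(whole_match == group_1 == text)
print(lines)
```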

Rising Star

@Pierre Villard It's OK, Thanks a lot.

Master Mentor

Keep in mind that FlowFile attributes live in memory. Loading the entire content of a file into a FlowFile attribute is going to have an impact on heap usage in your flow. That being said, there are two things to consider when building dataflows like this:

1. Increase the size of the heap available to the NiFi application. Heap space thresholds for NiFi are configured in the bootstrap.conf file and by default are very small (512 MB).

# JVM memory settings
java.arg.2=-Xms512m
java.arg.3=-Xmx512m

2. Take into consideration the data volumes you will be working with in this particular dataflow. To help prevent out-of-memory errors in NiFi, we have established a threshold on how many FlowFiles can queue on a connection before the FlowFiles' attributes are swapped out of heap to disk. The default, set in the nifi.properties file, is 20,000 (nifi.queue.swap.threshold=20000). This is per connection, not per flow, so if the FlowFiles you extracted content into begin to queue on numerous connections, you run the risk of hitting the out-of-memory condition sooner. You can decrease this value so swapping happens sooner, but that will in turn have an impact on performance.

I would start by increasing the heap memory for your NiFi and then go from there.
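
A rough back-of-the-envelope estimate shows why the per-connection threshold matters. The sketch below is only illustrative: the function name, the number of connections, and the 2 KB average attribute size are hypothetical, and it ignores JVM object overhead and string encoding.

```python
def worst_case_attribute_bytes(connections, swap_threshold, avg_attr_bytes):
    """Upper bound on attribute bytes held in heap before swapping starts.

    Assumes every connection is queued right up to its swap threshold.
    """
    return connections * swap_threshold * avg_attr_bytes

DEFAULT_SWAP_THRESHOLD = 20_000  # nifi.queue.swap.threshold from the post

# Hypothetical flow: 5 busy connections, about 2 KB of attributes per FlowFile.
estimate = worst_case_attribute_bytes(5, DEFAULT_SWAP_THRESHOLD, 2 * 1024)
print(f"{estimate / 1024**2:.0f} MiB")
```

With these made-up numbers the bound comes to roughly 195 MiB, a large share of a 512 MB heap.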

Rising Star

Hi @mclark

Thanks. We use TailFile before ExtractText, so each FlowFile contains only a few records of the entire file.