Updating The Apache OpenNLP Community Apache NiFi Processor to Support Flow Files

This release adds the ability to read content from the FlowFile and analyze it for locations, dates, organizations and names. It uses the pre-trained Apache OpenNLP 1.5 models that are available for download. These do a decent job, and you can build new models as needed. The processor now also outputs one attribute per entity type, each holding a String list of the locations, organizations, dates or names that were found.
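To make the "one attribute per type" output concrete, here is a minimal sketch of how per-type entity lists could be collapsed into String attributes. It is not the processor's actual code: the class `EntityAttributes`, the method `toAttributes`, and the `nlp_` + type naming pattern are assumptions for illustration (only `nlp_names` appears in this article).

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not taken from the processor source.
final class EntityAttributes {

    // Collapse each entity type's values into one comma-delimited attribute.
    // The "nlp_" + type naming is an assumption; only nlp_names is confirmed
    // by the article.
    static Map<String, String> toAttributes(Map<String, List<String>> entitiesByType) {
        Map<String, String> attributes = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> entry : entitiesByType.entrySet()) {
            List<String> values = entry.getValue();
            if (values == null || values.isEmpty()) {
                continue; // no attribute is written for a type with no matches
            }
            attributes.put("nlp_" + entry.getKey(), String.join(",", values));
        }
        return attributes;
    }
}
```

Skipping empty types keeps the FlowFile from carrying blank attributes; whether the real processor does the same is not stated here.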

I put out a new release, built around Apache NiFi 1.6.0.

Source and NAR Download

Download the Pre-trained Models for Your Language Here:

I chose English (en).
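The OpenNLP 1.5 name finder model files follow a `<language>-ner-<type>.bin` naming pattern, for example `en-ner-person.bin` or `en-ner-location.bin`. A small sketch of resolving a model file from a models directory might look like the following. The class `ModelFiles`, the method `modelPath`, and the `/models` directory are hypothetical, and not every language ships every entity type.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of the processor.
final class ModelFiles {

    // Build the path to a name finder model, e.g. en-ner-person.bin.
    // Names are found with the "person" model type in the 1.5 downloads.
    static Path modelPath(String modelsDir, String language, String entityType) {
        return Paths.get(modelsDir).resolve(language + "-ner-" + entityType + ".bin");
    }
}
```

For English, `ModelFiles.modelPath("/models", "en", "person")` points at the file used for names, assuming the models were unpacked into `/models`.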

In a future release I may add Organization, Money, Time and Percentage to the lists we extract, if there is interest.

A Final JSON File Produced


Example Output


The Main Flow For Trying Out The NLP Processor


Set Your Models


New NLP Processor Documentation (image: 76403-nlpdocs.png)

Here is the schema to use to process this data. Note that nlp_names is a String of comma-delimited values. You may want to parse it or do additional processing on these fields.
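Since nlp_names arrives as one comma-delimited String, downstream code will usually want it as a list. The following is a minimal sketch of that parsing step; the class `NlpAttributeParser` and the method `parse` are hypothetical names, and the sketch assumes no extracted value contains a comma itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of the processor.
final class NlpAttributeParser {

    // Split a comma-delimited attribute value into trimmed, non-empty entries.
    // Assumes the individual values never contain commas.
    static List<String> parse(String attributeValue) {
        List<String> result = new ArrayList<>();
        if (attributeValue == null) {
            return result; // attribute not set on the FlowFile
        }
        for (String part : attributeValue.split(",")) {
            String trimmed = part.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }
}
```

If extracted entities can contain commas, a different delimiter or a JSON array would be a safer encoding than plain splitting.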


High Level Flow


Example NiFi Flow




One thing we are missing is language detection; we may try Apache Tika or Apache OpenNLP for that.

We should also probably add attributes that let you specify exactly which models to use for Organization, Location, Name and Dates.

Last update: 08-17-2019 07:21 AM