Created on 05-23-2018 08:15 PM - edited 08-17-2019 07:21 AM
Updating The Apache OpenNLP Community Apache NiFi Processor to Support Flow Files
In this new release, we add the ability to read content from the FlowFile and analyze it for locations, dates, organizations and names. We are using the Apache OpenNLP 1.5 models that are available for download. These do a decent job, and you can build new models as needed. I also changed the processor to output one attribute per type, each holding a String list of the locations, organizations, dates or names found.
I put out a new release, built around Apache NiFi 1.6.0.
In a future release I may add Organization, Money, Time and Percentage to the lists we extract if there is interest.
A Final JSON File Produced
Example Output
The Main Flow For Trying Out The NLP Processor
Set Your Models
New NLP Processor Documentation
Here is the schema to use to process this data. Note: nlp_names is a String of comma-delimited values. You may want to parse this or do additional processing on these fields.
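If you want to work with the individual values downstream, a minimal sketch of splitting a comma-delimited value such as nlp_names could look like the following. The helper name `parse_nlp_list` and the sample value are hypothetical, made up for illustration; the real attribute content depends on your text and models.

```python
def parse_nlp_list(value):
    """Split a comma-delimited attribute value into a clean list of entries."""
    if not value:
        # Missing or empty attribute: nothing was extracted.
        return []
    # Strip whitespace around each entry and drop empty ones.
    return [item.strip() for item in value.split(",") if item.strip()]


# Hypothetical attribute value, not real processor output.
sample = "Jane Doe, John Smith,  Ada Lovelace"
names = parse_nlp_list(sample)
print(names)
```

Note that this simple split assumes the extracted values themselves never contain commas; if they can, you would need a different delimiter or extra handling.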