Code Repositories

Repo Description

NiFi Custom Processor for Extracting Text from Documents via Apache Tika

This processor extracts the raw text from PDF, Word, HTML, XML, Excel, PowerPoint, and other formats supported by Apache Tika.
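For readers who have not used Tika directly, here is a minimal sketch of this kind of extraction using Tika's `Tika` facade class. It is not the processor's actual code: the class name `TikaTextSketch` and the helper `extractText` are hypothetical, and the sketch assumes the tika-core and tika-parsers jars are on the classpath.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

import org.apache.tika.Tika;
import org.apache.tika.exception.TikaException;

// Hypothetical illustration class; not part of the nifi-extracttext-processor repo.
public class TikaTextSketch {

    // Lets Tika detect the document type and return its text content as a String.
    // Tika.parseToString closes the stream it is given.
    static String extractText(InputStream in) throws IOException, TikaException {
        Tika tika = new Tika();
        return tika.parseToString(in);
    }

    public static void main(String[] args) throws Exception {
        // A tiny in-memory HTML document stands in for a real file.
        byte[] html = "<html><body><p>Hello from Tika</p></body></html>"
                .getBytes(StandardCharsets.UTF_8);
        try (InputStream in = new ByteArrayInputStream(html)) {
            System.out.println(extractText(in).trim());
        }
    }
}
```

The same call should work for a PDF or Word file if you pass a `FileInputStream` instead, provided the matching Tika parser modules are available.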

Repo Info
GitHub Repo URL
GitHub account name tspannhw
Repo name nifi-extracttext-processor

Does that mean that NiFi has Apache Tika built in, or do we need to install Apache Tika separately?

If you look at my processor, you will see that I include Tika in it.

Other processors in Apache NiFi also use Tika.