Created on 02-06-2017 08:16 PM
NiFi Custom Processor for Extracting Text from Documents via Apache Tika
This processor extracts the raw text from PDF, Word, HTML, XML, Excel, PowerPoint, and other formats supported by Apache Tika.
Created on 07-19-2017 10:28 PM
Does that mean Apache Tika is built into NiFi, or do we need to install Apache Tika separately?
Created on 11-17-2017 09:39 PM
If you look at my processor, it includes Tika as a dependency in its pom.xml, so you do not need to install Tika separately:
https://github.com/tspannhw/nifi-extracttext-processor/blob/master/nifi-extracttext-processors/pom.x...
Other processors in Apache NiFi also use Tika.
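For anyone wondering what bundling Tika looks like in practice: a custom processor typically declares it as an ordinary Maven dependency, so the library ships inside the processor's NAR and nothing extra has to be installed on the NiFi host. The fragment below is only a rough sketch and not the contents of the linked pom.xml. The `tika.version` property is a placeholder you would define yourself, and the artifact names assume the Tika 1.x line.

```xml
<!-- Sketch only: not copied from the linked pom.xml. -->
<!-- ${tika.version} is a placeholder property that must be defined in <properties>. -->
<dependencies>
    <!-- Core Tika API (facade, detection, metadata) -->
    <dependency>
        <groupId>org.apache.tika</groupId>
        <artifactId>tika-core</artifactId>
        <version>${tika.version}</version>
    </dependency>
    <!-- Format-specific parsers (PDF, Office, HTML, and so on); Tika 1.x artifact name -->
    <dependency>
        <groupId>org.apache.tika</groupId>
        <artifactId>tika-parsers</artifactId>
        <version>${tika.version}</version>
    </dependency>
</dependencies>
```

With dependencies declared this way, building the processor's NAR module should pull the Tika jars into the bundle, which is why no external Tika installation is needed.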