06-26-2019 02:24 AM - last edited on 06-26-2019 06:21 AM by cjervis
I have just started exploring Apache Tika.
I want to understand how Apache Tika's background processing works.
Example: I have a 200-page PDF and use Tika to extract its text or features.
Will Tika run this extraction on a single node (i.e., treating the whole file as one block), or will it distribute the work across multiple nodes?
I'm comparing Tika's processing to MapReduce and want to learn whether Tika also processes a file block by block.
Please help me understand this background processing.