New Contributor
Posts: 1
Registered: ‎06-26-2019

Apache Tika architecture & processing nodes

I have just started exploring Apache Tika.

I want to understand how Apache Tika's background processing works.

Example: I have a 200-page PDF and use Tika to extract its text or features.


Will Tika run this extraction on a single node (i.e., treating one file as one block), or will it distribute the work across multiple nodes?


I'm comparing Tika's processing to MapReduce to learn whether Tika also processes a file block by block.


Please help me understand this background processing.