
Apache Tika architecture & processing nodes





I have just started exploring Apache Tika.

I want to understand how Apache Tika's background processes work.

Example: I have a 200-page PDF and use Tika to extract its text or features.


Will Tika run this extraction on a single node (i.e., treating the whole file as one block), or will it spread the work across multiple nodes?


I'm comparing Tika's processing to MapReduce and want to learn whether Tika also processes a file block by block.


Please help me understand these background processes.