Apache Tika architecture & processing nodes

Hi,

I have just started exploring Apache Tika.

I want to understand how Apache Tika's background processing works.

Example: I have a 200-page PDF and use Tika to extract its text or features.

Will Tika run this extraction on a single node (i.e., treating the whole file as one block), or will it distribute the work across multiple nodes?

I'm comparing Tika's processing to MapReduce and want to learn whether Tika also processes a file block by block.

Please help me understand how this background processing works.