Hi, I have a JMS queue that will be populated with different kinds of documents (PDFs, pictures, MS Office docs). Potentially millions of these documents can be pushed during the day (plus updates). I want to be able to run a real-time search over them. My question is: what is the best technology to achieve this? I know Solr should be used for search and indexing. I guess Storm should be used for processing the data in real time (I have also seen a Solr bolt). What I am not sure about is how Storm can process free-format files, index them, and make them available for search. Can the Storm Solr bolt do this, or do I need to use something else?