Fast search of 150 GB of Oracle data using Solr vs Spark

New Contributor

We have 150 GB of structured data in Oracle. We transfer the data to HDFS and build a Solr index on it. Using the Solr API, we provide interfaces for the front-end application to query the records. This process is very lengthy and time-consuming. If any records change, we have to index again.

We are looking for an alternative using Spark.

Does anyone have any suggestions on this? Our goal is to return query results quickly.

1 REPLY

Re: Fast search of 150 GB of Oracle data using Solr vs Spark

Contributor

Updating records shouldn't be that lengthy. If you use the REST API to update records, you should be fine for small batches. For larger updates, using morphlines with the batch indexer works like a charm. If you make sure you insert/update the new records using the same unique "id" field, the existing documents get overwritten with the new data, so changed records should not require rebuilding the whole index.

 

Update via REST:

curl 'http://localhost:8983/solr/update/json?commit=true' --data-binary @books.json -H 'Content-type:application/json'
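
To make the overwrite-by-id behaviour a bit more concrete, here is a minimal sketch of what a small update batch could look like. The document contents, the field names other than "id", the `post_updates` helper, and the `SOLR_UPDATE_URL` variable are all assumptions made up for illustration; only the curl call itself is taken from the command above. Depending on your Solr version you may also need a core or collection name in the URL path.

```shell
# Hypothetical sample batch: only "id" matters for overwriting;
# "title" and "price" are illustrative field names, not from the original post.
cat > books.json <<'EOF'
[
  {"id": "book-1", "title": "First title", "price": 10.0},
  {"id": "book-2", "title": "Second title", "price": 12.5}
]
EOF

# Hypothetical helper wrapping the curl command from the reply.
# SOLR_UPDATE_URL is an assumed override; adjust host, port, and path to your setup.
post_updates() {
  local file="$1"
  local url="${SOLR_UPDATE_URL:-http://localhost:8983/solr/update/json?commit=true}"
  curl "$url" --data-binary @"$file" -H 'Content-type:application/json'
}

# Not invoked here because it needs a running Solr instance:
# post_updates books.json
```

If a record changes in Oracle, re-posting a document with the same "id" (for example "book-1" with a new price) should replace the indexed copy, provided "id" is the unique key in your schema.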