I have been hitting this OOM issue with SolrCloud for a long time.
Currently the SolrCloud JVM heap is 20GB. If only queries (reads) are running, no OOM happens. If only indexing (writes) is running, no OOM happens either. However, when reads and writes run at the same time, the OOM happens again.
What is causing this?
I analyzed the JVM heap dump taken when the OOM happened: 95% of the heap was used by org.apache.lucene.codecs.blocktree.SegmentTermsEnumFrame.
Any ideas about that? Should we separate reads and writes onto different servers?
Thanks a lot.
Have you tried increasing the JVM heap to 30GB? This may resolve the issue in the near term, but you will likely still need to add more servers to the cluster.
How many servers are currently in your cloud configuration? If your query load and index load each work fine on their own but cause OOM issues when running simultaneously, then a common solution is to add more nodes to your cluster. Adding more nodes spreads the query and index loads across the servers, reducing the memory pressure on each one. Having said that, you should also consider how large your indexes are, how many shards you have, what indexing options you are using, etc.
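If it helps to answer the "how many servers, how many shards" questions, here is a rough Python sketch that summarizes how replicas are spread across nodes, based on the JSON returned by the Collections API CLUSTERSTATUS action. The base URL (http://localhost:8983/solr) is an assumption, and the function names are my own; adjust them to your setup.

```python
import json
from collections import defaultdict
from urllib.request import urlopen


def fetch_cluster_status(base_url):
    """Fetch CLUSTERSTATUS from one Solr node.

    base_url is an assumption for illustration, e.g. 'http://localhost:8983/solr'.
    """
    url = base_url.rstrip("/") + "/admin/collections?action=CLUSTERSTATUS&wt=json"
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


def replicas_per_node(status):
    """Count replicas hosted on each node, grouped by collection.

    Returns {node_name: {collection_name: replica_count}}.
    """
    counts = defaultdict(lambda: defaultdict(int))
    collections = status.get("cluster", {}).get("collections", {})
    for coll_name, coll in collections.items():
        for shard in coll.get("shards", {}).values():
            for replica in shard.get("replicas", {}).values():
                node = replica.get("node_name", "unknown")
                counts[node][coll_name] += 1
    # Convert nested defaultdicts to plain dicts for easy printing/comparison.
    return {node: dict(per_coll) for node, per_coll in counts.items()}
```

Calling replicas_per_node(fetch_cluster_status("http://localhost:8983/solr")) should give a per-node breakdown. If a few nodes host most of the replicas, those nodes carry most of the combined query and indexing memory load, and rebalancing or adding nodes is more likely to help.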
There are a number of factors that can cause heap space issues. This link may be helpful to you: https://wiki.apache.org/solr/SolrPerformanceProblems