Release SOLR JVM memory after index delete

Explorer

Hi,

My use case is zero downtime while batch indexing. To achieve this, I plan to create a new index every day in batch using the MapReduce indexer tool while the live index serves queries. After the new index is built, I will use a collection alias to switch over, then delete the stale index.
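For illustration, the switch and cleanup can be scripted against the Solr Collections API. This is a minimal Python sketch; the base URL, alias, and collection names are hypothetical placeholders:

# Minimal sketch of the alias-switch-and-delete flow via the Collections API.
# Collection and alias names below are hypothetical examples.
import urllib.request

SOLR = "http://localhost:8983/solr"            # assumed Solr base URL
ALIAS = "live_index"                           # hypothetical alias queried by clients
OLD, NEW = "index_20160101", "index_20160102"  # hypothetical daily collections

def collections_api(params):
    """Call the Collections API and return the raw JSON response."""
    url = f"{SOLR}/admin/collections?{params}&wt=json"
    with urllib.request.urlopen(url) as resp:
        return resp.read().decode()

# Point the alias at the freshly built collection (atomic switch for clients).
print(collections_api(f"action=CREATEALIAS&name={ALIAS}&collections={NEW}"))

# Once queries are confirmed to hit the new collection, drop the stale one.
print(collections_api(f"action=DELETE&name={OLD}"))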

 

Now my concern: after I delete an index (a huge one, around 320 million documents distributed across 64 shards over 4 Solr instances), the Solr JVM memory is not released. While the index is live, the Solr JVM shows around 10 GB occupied, and it stays the same even after I delete the index. If the Solr service is restarted, the memory is released. My understanding is that Solr uses its heap for various caches to speed up queries, but I need a way to clear those caches (for a particular index) when I delete the index, or I will soon face OOM errors. Can someone please help?
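For anyone reproducing this, heap usage can be polled from Solr's system info endpoint. A minimal sketch follows; the response layout is as observed on Solr 4.x/5.x and may differ slightly between versions:

# Watch Solr's reported JVM heap before and after an index delete.
# Endpoint and JSON layout are assumptions based on Solr 4.x/5.x; verify
# against your version.
import json
import urllib.request

SOLR = "http://localhost:8983/solr"  # assumed Solr base URL

def heap_usage():
    """Return (used, max) heap in bytes from Solr's system info endpoint."""
    with urllib.request.urlopen(f"{SOLR}/admin/info/system?wt=json") as resp:
        info = json.load(resp)
    raw = info["jvm"]["memory"]["raw"]
    return raw["used"], raw["max"]

used, mx = heap_usage()
print(f"heap used: {used / 2**30:.1f} GiB of {mx / 2**30:.1f} GiB")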

 

4 Solr instances

Solr Java heap size - 20 GB

Direct memory allocation - 4 GB

 

Below are the HdfsDirectoryFactory settings in solrconfig.xml:

 

<directoryFactory name="DirectoryFactory" class="${solr.directoryFactory:org.apache.solr.core.HdfsDirectoryFactory}">
  <str name="solr.hdfs.home">${solr.hdfs.home:}</str>
  <str name="solr.hdfs.confdir">${solr.hdfs.confdir:}</str>
  <str name="solr.hdfs.security.kerberos.enabled">${solr.hdfs.security.kerberos.enabled:false}</str>
  <str name="solr.hdfs.security.kerberos.keytabfile">${solr.hdfs.security.kerberos.keytabfile:}</str>
  <str name="solr.hdfs.security.kerberos.principal">${solr.hdfs.security.kerberos.principal:}</str>
  <bool name="solr.hdfs.blockcache.enabled">${solr.hdfs.blockcache.enabled:true}</bool>
  <str name="solr.hdfs.blockcache.global">${solr.hdfs.blockcache.global:true}</str>
  <int name="solr.hdfs.blockcache.slab.count">${solr.hdfs.blockcache.slab.count:1}</int>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">${solr.hdfs.blockcache.direct.memory.allocation:true}</bool>
  <int name="solr.hdfs.blockcache.blocksperbank">${solr.hdfs.blockcache.blocksperbank:16384}</int>
  <bool name="solr.hdfs.blockcache.read.enabled">${solr.hdfs.blockcache.read.enabled:true}</bool>
  <bool name="solr.hdfs.blockcache.write.enabled">${solr.hdfs.blockcache.write.enabled:false}</bool>
  <int name="solr.hdfs.blockcache.bufferstore.buffercount">${solr.hdfs.blockcache.bufferstore.buffercount:0}</int>
  <bool name="solr.hdfs.nrtcachingdirectory.enable">${solr.hdfs.nrtcachingdirectory.enable:true}</bool>
  <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">${solr.hdfs.nrtcachingdirectory.maxmergesizemb:16}</int>
  <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">${solr.hdfs.nrtcachingdirectory.maxcachedmb:192}</int>
</directoryFactory>
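As a rough sanity check on the direct memory side: the HDFS block cache size works out to slab count times blocks per bank times the cache-block size. The commonly documented cache-block size of 16 KB is an assumption here; verify against your Solr version:

# Rough direct-memory estimate for the global HDFS block cache above.
# The 16 KB cache-block size is an assumption based on common documentation.
BLOCK_SIZE = 16384        # bytes per cache block (assumed)
BLOCKS_PER_BANK = 16384   # solr.hdfs.blockcache.blocksperbank
SLAB_COUNT = 1            # solr.hdfs.blockcache.slab.count

cache_bytes = SLAB_COUNT * BLOCKS_PER_BANK * BLOCK_SIZE
print(f"block cache: {cache_bytes / 2**20:.0f} MB of direct memory")
# -> block cache: 256 MB of direct memory (well under the 4 GB allocation)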

 

Regards,

Gaurang

1 ACCEPTED SOLUTION

Super Collaborator

First of all, we are using the exact same pattern (create a new index, index into it, then switch the alias).

But I think your question is not really about that pattern; it is more about how the Solr JVM behaves.

As described, I'm not sure there is an actual problem. I have seen this behavior with many other products: the JVM tends not to release memory until the garbage collector reclaims it, i.e. when the memory needs to be reused.

 

If you don't find an answer here, you may want to fine-tune the GC parameters (a real science in itself); a possible starting point is sketched below. Support might also be able to help.
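For example, enabling GC logging is a safe first step before changing anything. These are standard HotSpot options for the Java 7/8 JVMs typical of this setup; the log path is a placeholder:

# GC logging first (observe before tuning); standard HotSpot flags:
-verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -Xloggc:/var/log/solr/solr_gc.log

# Then, if pauses are an issue, try a different collector, e.g. G1:
-XX:+UseG1GC -XX:MaxGCPauseMillis=250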

 

Best of luck!

mathieu


4 REPLIES


Explorer
Thanks Mathieu!

Explorer

Hi Mathieu,

 

You were right; it was related to garbage collection. The moment I loaded another big index, JVM memory use went above 70% and a GC cycle was triggered.

 

I was just being prematurely concerned about this.

 

Thanks Again!

 

Gaurang

Super Collaborator

That's great to know.

 

Best regards.