
Release SOLR JVM memory after index delete

Explorer

Hi,

 

My use case is zero downtime while batch indexing. To achieve this, I plan to build a new index every day in batch using the MapReduce indexer tool while the live index keeps serving queries. Once the new index is built, I will switch over using a collection alias, and after the switch I will delete the stale index.
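
For reference, here is a minimal sketch of the daily switch-over I have in mind, assuming the SolrCloud Collections API (CREATEALIAS and DELETE); the host, alias, and collection names are placeholders:

import requests

SOLR = "http://solr-host:8983/solr"  # placeholder Solr node URL

def switch_alias(alias, new_collection):
    # CREATEALIAS re-points the alias, so queries switch over atomically.
    r = requests.get(SOLR + "/admin/collections",
                     params={"action": "CREATEALIAS",
                             "name": alias,
                             "collections": new_collection})
    r.raise_for_status()

def drop_collection(name):
    # DELETE removes the stale collection once the alias no longer uses it.
    r = requests.get(SOLR + "/admin/collections",
                     params={"action": "DELETE", "name": name})
    r.raise_for_status()

switch_alias("live", "index_20160512")  # placeholder names
drop_collection("index_20160511")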

 

Now the concern I have is this: after I delete an index (a huge one, around 320 million documents, distributed across 64 shards spread over 4 Solr instances), the Solr JVM memory is not released. While the index is live, the Solr JVM shows around 10 GB occupied, and it stays the same even after I delete the index. If the Solr service is restarted, the memory is released. My understanding is that Solr uses its heap for various caches that speed up queries, but I need to figure out a way to clear those caches (for a particular index) when I delete the index, or I will soon face OOM errors. Can someone please help?
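
For what it's worth, this is roughly how I watch the heap from outside, assuming the standard system-info endpoint (the host is a placeholder, and the exact JSON layout may differ between Solr versions):

import requests

# Query any node's system-info endpoint for JVM memory figures.
info = requests.get("http://solr-host:8983/solr/admin/info/system",
                    params={"wt": "json"}).json()
mem = info["jvm"]["memory"]
print("heap used:", mem["used"], "of max:", mem["max"])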

 

4 Solr instances

Solr Java heap size: 20 GB

Direct memory allocation: 4 GB

 

Below are the HdfsDirectoryFactory settings from solrconfig.xml:

 

<directoryFactory name="DirectoryFactory"
                  class="${solr.directoryFactory:org.apache.solr.core.HdfsDirectoryFactory}">
  <str name="solr.hdfs.home">${solr.hdfs.home:}</str>
  <str name="solr.hdfs.confdir">${solr.hdfs.confdir:}</str>
  <str name="solr.hdfs.security.kerberos.enabled">${solr.hdfs.security.kerberos.enabled:false}</str>
  <str name="solr.hdfs.security.kerberos.keytabfile">${solr.hdfs.security.kerberos.keytabfile:}</str>
  <str name="solr.hdfs.security.kerberos.principal">${solr.hdfs.security.kerberos.principal:}</str>
  <!-- Off-heap block cache, shared globally across cores -->
  <bool name="solr.hdfs.blockcache.enabled">${solr.hdfs.blockcache.enabled:true}</bool>
  <str name="solr.hdfs.blockcache.global">${solr.hdfs.blockcache.global:true}</str>
  <int name="solr.hdfs.blockcache.slab.count">${solr.hdfs.blockcache.slab.count:1}</int>
  <bool name="solr.hdfs.blockcache.direct.memory.allocation">${solr.hdfs.blockcache.direct.memory.allocation:true}</bool>
  <int name="solr.hdfs.blockcache.blocksperbank">${solr.hdfs.blockcache.blocksperbank:16384}</int>
  <bool name="solr.hdfs.blockcache.read.enabled">${solr.hdfs.blockcache.read.enabled:true}</bool>
  <bool name="solr.hdfs.blockcache.write.enabled">${solr.hdfs.blockcache.write.enabled:false}</bool>
  <int name="solr.hdfs.blockcache.bufferstore.buffercount">${solr.hdfs.blockcache.bufferstore.buffercount:0}</int>
  <!-- NRTCachingDirectory wrapper around the HDFS directory -->
  <bool name="solr.hdfs.nrtcachingdirectory.enable">${solr.hdfs.nrtcachingdirectory.enable:true}</bool>
  <int name="solr.hdfs.nrtcachingdirectory.maxmergesizemb">${solr.hdfs.nrtcachingdirectory.maxmergesizemb:16}</int>
  <int name="solr.hdfs.nrtcachingdirectory.maxcachedmb">${solr.hdfs.nrtcachingdirectory.maxcachedmb:192}</int>
</directoryFactory>
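
As a side note, a back-of-the-envelope estimate of the off-heap block cache these settings imply, assuming the default 8 KB cache block size (worth double-checking for your Solr version):

# One slab = blocksperbank * block size; total = slab.count slabs.
block_size = 8192          # bytes per cached block (assumed default)
blocks_per_bank = 16384    # solr.hdfs.blockcache.blocksperbank
slab_count = 1             # solr.hdfs.blockcache.slab.count

total_bytes = slab_count * blocks_per_bank * block_size
print(total_bytes / 2**20, "MiB of direct memory")  # 128.0 MiB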

 

Regards,

Gaurang


4 REPLIES

Super Collaborator (Accepted Solution)

First of all, we use the exact same pattern (create a new index, index into it, then switch the alias).

But I think your question is not really about that pattern; it is more about how Solr (and its JVM) behaves.

As described, I'm not sure there is an actual problem. I have seen this behavior with many other products: the JVM tends not to release memory until the garbage collector reclaims it, which only happens when the memory needs to be reused.

If you don't get an answer here, you may want to try fine-tuning the GC parameters (a science in its own right). I imagine support could help you with that too.
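
For reference, recent Solr releases expose the GC flags through the GC_TUNE variable in solr.in.sh; the variable name may differ for older or packaged installs, and the values below are purely illustrative, not a recommendation:

# solr.in.sh -- illustrative values only; tune for your heap and workload
GC_TUNE="-XX:+UseG1GC \
         -XX:MaxGCPauseMillis=250 \
         -XX:+ParallelRefProcEnabled"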

 

Best of luck!

mathieu

Explorer
Thanks Mathieu!

Explorer

Hi Mathieu,

 

You were right, it was related to garbage collection. The moment I loaded another big index, JVM memory use went above 70% and a GC cycle was initiated.

 

I was just prematurely concerned about this.

 

Thanks Again!

 

Gaurang

Super Collaborator

That's great to know.

 

Best regards.