Explorer
Posts: 17
Registered: ‎08-28-2014

Data retention policy for hbase-solr documents with NRT

Hi,

I have created Solr collections to index documents from HBase tables and enabled NRT indexing using the Key-Value Store Indexer service. After I populated the HBase table, the Solr documents were created through NRT and I can query them from the Solr server. The setup works.

Next, I'm working on data retention for the Solr documents. I expected that if I enabled TTL on the HBase table, the data would be purged from the table after x minutes, and the NRT indexer would pick up those changes and remove the corresponding Solr documents. That expectation was wrong: the Solr documents are not cleaned up when the HBase data expires via TTL.

Please let me know how to resolve this, or how to set up a data retention policy on the Solr documents that exactly matches HBase's TTL.

I'm using HBase version 0.98.6-cdh5.3.0.

Solr versions:

  • solr-spec 4.4.0-cdh5.3.0
  • solr-impl 4.4.0-cdh5.3.0 exported - jenkins - 2014-12-16 19:08:08
  • lucene-spec 4.4.0-cdh5.3.0
  • lucene-impl 4.4.0-cdh5.3.0 exported - jenkins - 2014-12-16 19:02:38

 

Thanks,

Surya

Cloudera Employee
Posts: 146
Registered: ‎08-21-2013

Re: Data retention policy for hbase-solr documents with NRT

The HBase TTL feature isn't supported by hbase-indexer, because HBase doesn't send delete events through HBase replication for TTL deletes.

Wolfgang.

Explorer
Posts: 17
Registered: ‎08-28-2014

Re: Data retention policy for hbase-solr documents with NRT

Thanks for confirming, Wolfgang.

Is this feature enabled in any higher version of CDH? If not, could you please help me set up a similar TTL feature on the Solr documents? How can I delete the Solr documents older than xx days?

 

Thanks,

Surya

New Contributor
Posts: 1
Registered: ‎04-25-2017

Re: Data retention policy for hbase-solr documents with NRT

I have the same problem. How did you resolve it?

Cloudera Employee
Posts: 172
Registered: ‎01-09-2014

Re: Data retention policy for hbase-solr documents with NRT

You can use the Solr TTL feature to ensure document expiration. The details are here: https://lucidworks.com/2014/05/07/document-expiration/
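
If the expiration feature described in that link is not available in the Solr version you run, a periodic delete-by-query is one possible alternative for removing documents older than xx days. The sketch below only builds the request; it assumes every document carries a date field recording when it was indexed. The field name `indexed_at`, the collection name `my_collection`, the host URL, and the helper `build_delete_older_than` are all hypothetical and are not taken from this thread.

```python
import json
import urllib.request


def build_delete_older_than(solr_url, collection, date_field, days):
    """Build (without sending) a Solr delete-by-query request.

    The query uses Solr date math to match documents whose
    `date_field` value is older than `days` days.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    # Range query with date math: everything up to NOW minus N days.
    query = f"{date_field}:[* TO NOW-{days}DAYS]"
    body = json.dumps({"delete": {"query": query}}).encode("utf-8")
    url = f"{solr_url.rstrip('/')}/{collection}/update?commit=true"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical host, collection, and field names.
    req = build_delete_older_than(
        "http://localhost:8983/solr", "my_collection", "indexed_at", 30
    )
    print(req.full_url)
    print(req.data.decode("utf-8"))
    # To actually send it: urllib.request.urlopen(req)
```

Run from cron or another scheduler, this would approximate a retention window, but it will not line up exactly with the moment HBase expires a cell.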

-pd