How can I delete records in HBase and Solr that are more than 2 hours old?

Rising Star

Hi,

I'm using NiFi, and the data flow runs every minute. I'm fetching CSVs from a web service and putting them in HDFS.

Every file inserted into HDFS also needs to be saved to HBase, but I need to delete rows that are more than 2 hours old. For example, if the time is now 11:03 AM and I have records/rows that were inserted at 9:03 AM, then when the time becomes 11:04 AM I need to delete the records that were inserted at 9:03 AM. This deletion process should also run every minute, and the deleted records also need to be deleted in Solr + Banana UI.

Is this possible?

Thanks.

1 ACCEPTED SOLUTION

Master Mentor

Deleting rows in HBase is a heavy operation. Instead of managing deletions yourself, let HBase handle it via TTL. Basically, you can set an expiration on a column family, or alternatively on individual cells, and the data will be marked as deleted once the time to live expires; the time is in UTC. https://hbase.apache.org/book.html#ttl

Once a row has a delete marker, it will be cleaned up by the standard compaction mechanism.
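
As a rough sketch of what this could look like: a column-family TTL is given in seconds, so a two-hour retention works out to 7200. The Python below only builds an HBase shell `alter` statement and mirrors the expiry rule for the 9:03 AM example from the question. The table name `csv_records` and column family `d` are hypothetical placeholders, not names from this thread, and the date used is arbitrary.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(hours=2)


def hbase_alter_ttl(table: str, family: str, ttl: timedelta) -> str:
    """Build an HBase shell statement that sets a column-family TTL (in seconds)."""
    seconds = int(ttl.total_seconds())
    return f"alter '{table}', {{NAME => '{family}', TTL => {seconds}}}"


def is_expired(cell_ts: datetime, now: datetime, ttl: timedelta = RETENTION) -> bool:
    """Approximate the expiry rule: a cell counts as expired once its age exceeds the TTL."""
    return now - cell_ts > ttl


# Hypothetical table and column family names, for illustration only.
statement = hbase_alter_ttl("csv_records", "d", RETENTION)
print(statement)

# Arbitrary date; a row written at 09:03 is checked again at 11:04.
inserted = datetime(2017, 1, 1, 9, 3, tzinfo=timezone.utc)
print(is_expired(inserted, inserted + timedelta(hours=2, minutes=1)))
```

The printed statement is meant to be pasted into `hbase shell`. With a TTL in place, no per-minute delete job should be needed on the HBase side: expired cells stop being returned by reads and are physically removed during compaction.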

3 REPLIES

Rising Star

Thanks, @Artem Ervits, I'll try this one.

Contributor

Solr also supports TTL, although I think if the docs are deleted from HBase they should be deleted in Solr automatically.

In case you are interested in the Solr TTL feature, it's done through an UpdateRequestProcessor (URP). It's currently only documented in Solr's Javadocs: http://lucene.apache.org/solr/6_4_0/solr-core/org/apache/solr/update/processor/DocExpirationUpdatePr... (replace the '6_4_0' part of that URL to get to the javadocs for your version; this URP has existed since Solr 4.8.0).
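
To illustrate the indexing side, here is a minimal sketch that assumes the expiration URP is already configured in your update chain and reads the TTL from a `_ttl_` field (check the field name against the Javadocs for your version). The TTL value is a Solr date-math expression such as `+2HOURS`. The `id` and `source_file` fields are hypothetical placeholders, and the snippet only builds the JSON payload; it does not send anything to Solr.

```python
import json


def with_ttl(doc: dict, ttl: str = "+2HOURS", ttl_field: str = "_ttl_") -> dict:
    """Return a copy of doc carrying a Solr date-math TTL in ttl_field."""
    tagged = dict(doc)
    tagged[ttl_field] = ttl
    return tagged


# Hypothetical document; every field name except the TTL field is a placeholder.
docs = [with_ttl({"id": "row-0001", "source_file": "example.csv"})]
payload = json.dumps(docs)
print(payload)
```

The payload would then be posted to the collection's update handler as usual. Whether expired documents are actually purged depends on how the processor is configured, so treat this as a starting point rather than a complete setup.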