Assume an HBase table with a rowkey and two column families, CF_1 and CF_2.
CF_1 holds a fixed number of "key columns".
CF_2 holds a variable number of "assignment columns".
TTLs are set on all columns at write time.
The requirement is to expire CF_1 only once all columns in CF_2 have expired. The CF_2 columns are set to expire at different times.
Is there an HBase feature that allows implementing this requirement in real time, like a rule-based trigger? I am not looking for a script to do it.
You might be able to implement this with a custom coprocessor, but it would likely be very challenging to get correct.
I'd probably recommend a nightly job to prune the results from your table and add some application logic to ignore such records (until your job runs again).
Unlike in an RDBMS, most HBase features are cell-oriented rather than row-oriented. For example, TTL is evaluated for each individual cell rather than for a row as a whole. Compactions (which is how HBase expires data) also run separately for each column family, so they never see all of the data for a given row.
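To make the per-cell behaviour concrete, here is a minimal sketch in plain Java. It does not use the HBase client API: `SketchCell` and `isExpired` are hypothetical names, and the rule "expired once now is past timestamp + TTL" is a simplified model of the check, assumed here for illustration.

```java
import java.util.Arrays;
import java.util.List;

public class CellTtlSketch {
    /** Hypothetical stand-in for an HBase cell; not the real Cell API. */
    static final class SketchCell {
        final String family;
        final String qualifier;
        final long timestampMs; // write time
        final long ttlMs;       // time-to-live assigned at write time

        SketchCell(String family, String qualifier, long timestampMs, long ttlMs) {
            this.family = family;
            this.qualifier = qualifier;
            this.timestampMs = timestampMs;
            this.ttlMs = ttlMs;
        }
    }

    /** Simplified model: each cell is judged on its own, with no view of the row. */
    static boolean isExpired(SketchCell cell, long nowMs) {
        return nowMs > cell.timestampMs + cell.ttlMs;
    }

    public static void main(String[] args) {
        long now = 10_000L;
        // One logical row: a key column in CF_1 and two assignment columns in CF_2.
        List<SketchCell> row = Arrays.asList(
                new SketchCell("CF_1", "key_a", 0L, 5_000L),
                new SketchCell("CF_2", "assign_1", 0L, 8_000L),
                new SketchCell("CF_2", "assign_2", 0L, 20_000L));
        for (SketchCell c : row) {
            System.out.println(c.family + ":" + c.qualifier + " expired=" + isExpired(c, now));
        }
    }
}
```

In this model the CF_1 key column is already expired while one CF_2 assignment column is still live, which is exactly the situation the requirement is trying to rule out and which a per-cell TTL alone cannot prevent.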
However, you can still implement what you want with some amount of code. As Josh suggests, you can actually implement a Filter that will only return rows that match your TTL criteria. Then you can issue deletes for those rows periodically.
1. Implement a filter that returns the rows matching the TTL criteria.
2. Run a daily job that sets the TTL so that each such "logical" row expires at compaction time, with the compaction forced to happen at off-peak hours.
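The row-level rule behind both steps can be sketched in plain Java as follows. This is only an illustration of the selection logic, not HBase client code: `RowSnapshot`, `isLogicallyExpired` and `selectRowsToExpire` are hypothetical names, the expiry times are assumed to have been computed already (write timestamp plus TTL), and a row with no CF_2 columns at all is assumed to count as expired.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LogicalRowExpirySketch {
    /** Hypothetical per-row view: the rowkey plus the expiry time of every CF_2 column. */
    static final class RowSnapshot {
        final String rowKey;
        final List<Long> cf2ExpiryTimesMs;

        RowSnapshot(String rowKey, List<Long> cf2ExpiryTimesMs) {
            this.rowKey = rowKey;
            this.cf2ExpiryTimesMs = cf2ExpiryTimesMs;
        }
    }

    /** A row is logically expired only when every CF_2 column has expired. */
    static boolean isLogicallyExpired(RowSnapshot row, long nowMs) {
        for (long expiry : row.cf2ExpiryTimesMs) {
            if (nowMs <= expiry) {
                return false; // at least one assignment column is still live
            }
        }
        return true; // assumption: an empty CF_2 also counts as expired
    }

    /** Step 1: pick the rowkeys whose CF_1 data may now be expired or deleted. */
    static List<String> selectRowsToExpire(List<RowSnapshot> rows, long nowMs) {
        List<String> selected = new ArrayList<>();
        for (RowSnapshot row : rows) {
            if (isLogicallyExpired(row, nowMs)) {
                selected.add(row.rowKey);
            }
        }
        return selected;
    }

    public static void main(String[] args) {
        long now = 10_000L;
        List<RowSnapshot> rows = Arrays.asList(
                new RowSnapshot("row-1", Arrays.asList(4_000L, 9_000L)),
                new RowSnapshot("row-2", Arrays.asList(4_000L, 15_000L)));
        System.out.println(selectRowsToExpire(rows, now));
    }
}
```

Here only row-1 is selected, because row-2 still has a live assignment column. What the periodic job then does with the selected rowkeys (issuing deletes, or adjusting the TTL ahead of a forced compaction) is deliberately left out of the sketch.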