I have a few gaps in my theoretical understanding that I need filled. Could you please help me out?
I have looked at the Lily documentation at https://github.com/NGDATA/hbase-indexer/wiki, but I am still unsure about a few aspects.
Consider a scenario where new rows are constantly being added to an HBase table. Two Lily indexers, L1 and L2, then process the added data asynchronously.
1. Since I could not find any WAL-related properties in the Lily Indexer configuration, I presume the indexer uses the HBase WAL. Is that right?
2. How do Lily indexers L1 and L2 coordinate so that they do not index the same data twice?
3. What happens if one indexer crashes while indexing an HBase row? Do the other indexers then reindex the data that the crashed indexer failed to index?
4. What is the best approach to verify that all HBase records have been indexed and are present in Solr? For example, would an occasional batch job be appropriate?