Support Questions

ginsul · ‎10-24-2016

Hi,

There are a few theoretical gaps that I need filled. Could you please help me out with that?

I have looked at the Lily documentation provided at https://github.com/NGDATA/hbase-indexer/wiki, but I am still uncertain about a few aspects.

Consider a scenario where new rows are constantly being added to an HBase table. Two Lily indexers, L1 and L2, then asynchronously process the added data.

1. Since I could not find WAL-related properties in Lily Indexer's configuration, I presume the indexer uses HBase WAL. Is that right?
2. How do Lily indexers L1 and L2 coordinate so that they do not index the same data twice?
3. What happens if one indexer crashes while indexing an HBase row? Do the other indexers then reindex the data that the crashed indexer failed to index?
4. What is the best approach to verify that all HBase records have been indexed and are present in Solr? E.g. an occasional batch job?

Thanks,

Gin

whosch · ‎10-24-2016

Here is a useful related read: http://www.ngdata.com/the-hbase-side-effect-processor-and-hbase-replication-monitoring/
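
On question 4, one possible shape for an occasional batch job is a plain comparison of HBase row keys against Solr document ids. The following is only a minimal sketch under stated assumptions: the two id collections have already been exported into Python iterables (fetching them from HBase and Solr is not shown), each Solr document id is assumed to equal the HBase row key, and the function name `find_index_gaps` is hypothetical.

```python
def find_index_gaps(hbase_row_keys, solr_doc_ids):
    """Compare two collections of ids and report what is missing on each side.

    Assumption: a Solr document id is identical to its HBase row key.
    """
    hbase = set(hbase_row_keys)
    solr = set(solr_doc_ids)
    return {
        # Rows present in HBase that have no matching Solr document.
        "missing_in_solr": sorted(hbase - solr),
        # Solr documents whose row no longer exists in HBase.
        "orphaned_in_solr": sorted(solr - hbase),
    }


if __name__ == "__main__":
    # Illustrative ids only, not real data.
    gaps = find_index_gaps(["row1", "row2", "row3"], ["row1", "row3", "row9"])
    print(gaps)
```

Because indexing is asynchronous, rows written shortly before the check may show up as missing even though they are still going to be indexed, so such a job would probably want to ignore very recent rows or re-check them later.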

ginsul · ‎10-25-2016

Fantastic! Thanks, whosch.

Cloudera Community

Lily Indexer: ensuring Hbase and Solr record consistency