Support Questions

avijeetd · ‎12-12-2016

Hi,

I have a fundamental question about the storage engine options (Hive/HBase/Solr) on HDFS.

Q1. If we ingest data into HDFS and then build a Solr index, is that the same as ingesting the data directly into Solr, in terms of storage usage and layout?

Q2. Is there an approach that lets us ingest once and still use all three data-access options optimally: Hive for faster scans, HBase for bulk retrieval, and Solr for record-level search?

Thanks,

Avijeet

mjohansson · ‎12-12-2016

Hi @Avijeet Dash,

The Solr index requires persistent storage as well.

There are several options for reading HBase from Hive and Solr from Hive. They all rely on storage handlers and SerDes, such as https://github.com/lucidworks/hive-solr and https://github.com/chimpler/hive-solr.

For Hive/HBase integration, see also https://cwiki.apache.org/confluence/display/Hive/StorageHandlers
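
To give a rough idea of what the storage handler approach looks like for HBase, here is a minimal sketch that renders the Hive DDL for a table backed by an existing HBase table. The handler class and the `hbase.columns.mapping` / `hbase.table.name` properties come from the Hive HBase integration; the helper function, the table names (`customers_hbase`, `customers`), the column family `d`, and the columns are made up for illustration, so adjust them to your own schema.

```python
def hbase_backed_hive_ddl(hive_table, hbase_table, key_column, columns):
    """Render a CREATE EXTERNAL TABLE statement that maps a Hive table
    onto an existing HBase table through the HBase storage handler.

    key_column: (hive_column, hive_type) mapped to the HBase row key.
    columns:    list of (hive_column, hive_type, "family:qualifier").
    """
    col_defs = [f"{key_column[0]} {key_column[1]}"]
    mapping = [":key"]  # the first Hive column maps to the HBase row key
    for name, hive_type, hbase_col in columns:
        col_defs.append(f"{name} {hive_type}")
        mapping.append(hbase_col)
    # Mapping entries are joined without whitespace, one per Hive column.
    return (
        f"CREATE EXTERNAL TABLE {hive_table} ({', '.join(col_defs)})\n"
        "STORED BY 'org.apache.hadoop.hive.hbase.HBaseStorageHandler'\n"
        f"WITH SERDEPROPERTIES ('hbase.columns.mapping' = '{','.join(mapping)}')\n"
        f"TBLPROPERTIES ('hbase.table.name' = '{hbase_table}');"
    )


# Hypothetical schema: names below are assumptions, not from a real cluster.
ddl = hbase_backed_hive_ddl(
    hive_table="customers_hbase",
    hbase_table="customers",
    key_column=("id", "string"),
    columns=[("name", "string", "d:name"), ("city", "string", "d:city")],
)
print(ddl)
```

Running the generated statement in Hive should let you query the HBase data with HiveQL without copying it, assuming the HBase handler jars are available to Hive.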

Hope this helps.

/Best regards, Mats
