Created 04-10-2017 03:54 PM
I have a separate Solr collection for each HBase table. Indexing works when I define a single collection in the morphline.conf file, but I am not sure how to define multiple collections there. Cloudera Manager uses a single morphline.conf file, so I am only able to declare one collection in it. Below is an example of the morphline.conf file.
I have tried a couple of things: adding another locator, SOLR_LOCATOR2, at the end of the file,
and declaring multiple collections in a single entry separated by commas, e.g. collection : demoTable1_collection,demoTable2_collection, but it didn't work.
SOLR_LOCATOR : {
  # Name of solr collection
  collection : demoTable2_collection
  # ZooKeeper ensemble
  zkHost : "$ZK_HOST"
}
Created on 04-10-2017 04:32 PM - edited 04-10-2017 04:34 PM
The 'morphlineId' property for the morphlineSolrSink would allow you to specify different morphlines within the same morphline.conf to be used for multiple collections. Are you using separate sinks for the respective collections?
To clarify, the above entry would be for Flume. What are you using to index the data?
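A minimal sketch of how one morphline.conf could hold a separate morphline per collection, assuming one locator per collection and one morphline per table. The names SOLR_LOCATOR_1, SOLR_LOCATOR_2, morphline1 and morphline2 are hypothetical, and the table-specific commands are left as placeholders. Whichever component runs the morphline (for example a Flume sink via its morphlineId property) would then pick one entry by its id.

```
# Hypothetical names throughout: one locator per Solr collection.
SOLR_LOCATOR_1 : {
  collection : demoTable1_collection
  zkHost : "$ZK_HOST"
}

SOLR_LOCATOR_2 : {
  collection : demoTable2_collection
  zkHost : "$ZK_HOST"
}

morphlines : [
  {
    # Selected when the caller asks for morphlineId "morphline1"
    id : morphline1
    importCommands : ["org.kitesdk.**", "com.ngdata.**"]
    commands : [
      # ... commands specific to the first table go here ...
      { sanitizeUnknownSolrFields { solrLocator : ${SOLR_LOCATOR_1} } }
    ]
  }
  {
    # Selected when the caller asks for morphlineId "morphline2"
    id : morphline2
    importCommands : ["org.kitesdk.**", "com.ngdata.**"]
    commands : [
      # ... commands specific to the second table go here ...
      { sanitizeUnknownSolrFields { solrLocator : ${SOLR_LOCATOR_2} } }
    ]
  }
]
```

Each morphline refers only to its own locator, so the two collections stay independent even though they share one file.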
-pd
Created 04-10-2017 07:10 PM
I am processing the data with Spark, inserting it into HBase, and indexing it with Solr, so I have several HBase tables to index. I create one collection per table, and indexing works when I put the collection name and fields in the morphline.conf file, but I don't understand how to add multiple collections. Do I have to add different morphline IDs to the same morphline.conf file?
Below is the morphline.conf. Do I have to add multiple morphline IDs?
SOLR_LOCATOR2 : {
  # Name of solr collection
  collection : demoTable2_collection
  # ZooKeeper ensemble
  zkHost : "$ZK_HOST"
}
morphlines : [
  {
    id : morphline
    importCommands : ["org.kitesdk.**", "com.ngdata.**"]
    commands : [
      {
        extractHBaseCells {
          mappings :
Created 04-11-2017 09:58 AM