Member since 08-21-2013
146 Posts
25 Kudos Received
34 Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2501 | 10-24-2016 10:43 AM |
| | 5315 | 03-13-2016 02:15 PM |
| | 2956 | 12-11-2015 01:48 AM |
| | 2317 | 11-23-2015 12:11 PM |
| | 2161 | 07-06-2015 10:40 AM |
05-23-2017
04:06 PM
Custom morphline commands can maintain state if needed, so in principle this is possible.
02-13-2017
09:26 AM
The log file will be on the remote hosts that ran the map tasks, not on the host that started the MapReduce driver. Wolfgang
10-24-2016
10:43 AM
Here is a useful related read: http://www.ngdata.com/the-hbase-side-effect-processor-and-hbase-replication-monitoring/
03-13-2016
02:15 PM
1 Kudo
Looks like you are missing a loadSolr command in your morphline. For an example, see http://www.cloudera.com/documentation/enterprise/latest/topics/search_batch_index_use_mapreduce.html?scroll=csug_topic_4_3 (FYI, with MapReduceIndexerTool the SOLR_LOCATOR is substituted from whatever is specified on the CLI with the --zk-host option.)
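For illustration, a minimal morphline that ends in a loadSolr command might look like the sketch below. The collection name, the morphline id and the importCommands patterns are placeholder assumptions, and the commented gap stands for whatever parsing and transformation commands the morphline already contains.

```
SOLR_LOCATOR : {
  # Name of the Solr collection (placeholder)
  collection : collection1
  # Per the note above, filled in from the --zk-host CLI option
  zkHost : ${ZK_HOST}
}

morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
    commands : [
      # ... existing parsing and transformation commands go here ...

      # Final step: send each record to Solr
      { loadSolr { solrLocator : ${SOLR_LOCATOR} } }
    ]
  }
]
```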
12-11-2015
01:48 AM
On YARN, the parameters are called mapreduce.map.java.opts and mapreduce.reduce.java.opts. Wolfgang.
11-23-2015
11:00 PM
Custom morphline commands are deployed by adding the jar with the custom code to the hbase-indexer Java classpath. The morphline runs inside the hbase-indexer processes, which are separate from the HBase processes, so it has no impact on the stability of the HBase service.
11-23-2015
12:11 PM
1 Kudo
You can plug a morphline into hbase-indexer to do some mini ETL on the fly during indexing from HBase into Solr. See the docs: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/search_hbase_batch_indexer.html and http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/search_etl_morphlines.html
07-06-2015
10:40 AM
1 Kudo
The SOLR_LOCATOR is a variable that works via simple text substitution (à la Unix shell scripts). You can define as many variables as you like within the same morphline config file. For example, along these lines:

    SOLR_LOCATOR_1 : { collection : collection1, zkHost : ${ZK_HOST} }
    SOLR_LOCATOR_2 : { collection : collection2, zkHost : ${ZK_HOST} }
    morphlines : [
      {
        id : morphline1
        ...
        { loadSolr { solrLocator : ${SOLR_LOCATOR_1} } }
      }
      {
        id : morphline2
        ...
        { loadSolr { solrLocator : ${SOLR_LOCATOR_2} } }
      }
    ]

Wolfgang
06-12-2015
05:54 AM
Try using the sanitizeUnknownSolrFields command, per http://kitesdk.org/docs/current/morphlines/morphlines-reference-guide.html#sanitizeUnknownSolrFields Wolfgang.
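As a rough sketch of where the command could go, it is typically placed just before loadSolr so that fields not declared in the Solr schema are removed before the record is sent. The SOLR_LOCATOR variable is assumed to be defined elsewhere in the same config file, and the elided comment stands for the morphline's existing commands.

```
commands : [
  # ... earlier commands ...

  # Drop record fields that are unknown to the Solr schema
  {
    sanitizeUnknownSolrFields {
      # Location from which to fetch the Solr schema
      solrLocator : ${SOLR_LOCATOR}
    }
  }

  { loadSolr { solrLocator : ${SOLR_LOCATOR} } }
]
```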
06-12-2015
03:50 AM
Maybe readAvroContainer fails because your Avro data isn't contained in an Avro container, in which case use the readAvro command instead of readAvroContainer.

In any case, to automatically print diagnostic information such as the content of records as they pass through the morphline commands, consider enabling the TRACE log level, for example by adding the following line to your log4j.properties file:

    log4j.logger.org.kitesdk.morphline=TRACE

This will also print which command failed where. See http://kitesdk.org/docs/current/morphlines/morphlines-reference-guide.html#logTrace

BTW, questions specific to Cloudera Search are best directed to search-user@cloudera.org via http://groups.google.com/a/cloudera.org/group/search-user

Wolfgang
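A sketch of the swap, assuming the data is plain Avro written with a known schema: readAvro needs the schema supplied explicitly because there is no container header to read it from. The schema path is a hypothetical placeholder, and the logTrace step is optional and only produces output once the TRACE level is enabled.

```
commands : [
  {
    readAvro {
      # Schema the data was written with (hypothetical path)
      writerSchemaFile : /path/to/schema.avsc
    }
  }

  # Log each record as it leaves readAvro (visible at TRACE level)
  { logTrace { format : "output record: {}", args : ["@{}"] } }
]
```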