Member since 08-21-2013 | 146 Posts | 25 Kudos Received | 34 Solutions
My Accepted Solutions
Views | Posted |
---|---|
2407 | 10-24-2016 10:43 AM |
5133 | 03-13-2016 02:15 PM |
2875 | 12-11-2015 01:48 AM |
2236 | 11-23-2015 12:11 PM |
2114 | 07-06-2015 10:40 AM |
05-23-2017
04:06 PM
Custom morphline commands can maintain state if you need them to, so in principle it is possible.
02-13-2017
09:26 AM
The log file will be on the remote hosts that ran the map tasks, not on the host that started the MapReduce driver. Wolfgang
10-24-2016
10:43 AM
Here is a useful related read: http://www.ngdata.com/the-hbase-side-effect-processor-and-hbase-replication-monitoring/
03-13-2016
02:15 PM
1 Kudo
It looks like you are missing a loadSolr command in your morphline. For an example, see http://www.cloudera.com/documentation/enterprise/latest/topics/search_batch_index_use_mapreduce.html?scroll=csug_topic_4_3 (FYI, with MapReduceIndexerTool the SOLR_LOCATOR is substituted from whatever is specified on the CLI with the --zk-host option.)
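As a rough sketch of where loadSolr sits, a morphline for batch indexing might end like this. The collection name, the ZooKeeper address, and the readLine parsing step are placeholders chosen for illustration, not taken from the original question:

```hocon
# Illustrative values only; replace with your own collection and ZooKeeper ensemble.
SOLR_LOCATOR : {
  collection : collection1                 # hypothetical collection name
  zkHost : "zk01.example.com:2181/solr"    # hypothetical ZooKeeper address
}

morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
    commands : [
      # Stand-in for whatever parsing commands your input format needs.
      { readLine { charset : UTF-8 } }

      # Final step: without it, no record is ever handed to Solr.
      { loadSolr { solrLocator : ${SOLR_LOCATOR} } }
    ]
  }
]
```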
12-11-2015
01:48 AM
On YARN, the parameters are called mapreduce.map.java.opts and mapreduce.reduce.java.opts. Wolfgang.
11-23-2015
11:00 PM
Custom morphline commands are deployed by adding the jar with the custom code to the hbase-indexer Java classpath. The morphline runs inside the hbase-indexer processes, which are separate from the HBase processes, so it has no impact on the stability of the HBase service.
11-23-2015
12:11 PM
1 Kudo
You can plug a morphline into hbase-indexer to do some mini ETL on the fly during indexing from HBase into Solr. See the docs: http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/search_hbase_batch_indexer.html and http://www.cloudera.com/content/www/en-us/documentation/enterprise/latest/topics/search_etl_morphlines.html
07-06-2015
10:40 AM
1 Kudo
The SOLR_LOCATOR is a variable that works via simple text substitution (à la Unix shell scripts). You can define as many variables as you like within the same morphline config file. For example, along these lines:
SOLR_LOCATOR_1 : { collection : collection1, zkHost : ${ZK_HOST} }
SOLR_LOCATOR_2 : { collection : collection2, zkHost : ${ZK_HOST} }
morphlines : [
  { id : morphline1 ... { loadSolr { solrLocator : ${SOLR_LOCATOR_1} } } }
  { id : morphline2 ... { loadSolr { solrLocator : ${SOLR_LOCATOR_2} } } }
]
Wolfgang
06-12-2015
05:54 AM
Try using the sanitizeUnknownSolrFields command, per http://kitesdk.org/docs/current/morphlines/morphlines-reference-guide.html#sanitizeUnknownSolrFields Wolfgang.
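A minimal sketch of the placement, assuming a SOLR_LOCATOR variable is already defined earlier in the same config file; the command goes right before loadSolr so that fields not declared in the Solr schema are dealt with before the record is sent:

```hocon
commands : [
  # ... your parsing and transformation commands go here ...

  # Handle record fields that the Solr schema does not know about.
  { sanitizeUnknownSolrFields { solrLocator : ${SOLR_LOCATOR} } }

  # Send the cleaned-up record to Solr.
  { loadSolr { solrLocator : ${SOLR_LOCATOR} } }
]
```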
06-12-2015
03:50 AM
Maybe readAvroContainer fails because your Avro data isn't contained in an Avro container, in which case use the readAvro command instead of readAvroContainer.
In any case, to automatically print diagnostic information such as the content of records as they pass through the morphline commands, consider enabling the TRACE log level, for example by adding the following line to your log4j.properties file:
log4j.logger.org.kitesdk.morphline=TRACE
See http://kitesdk.org/docs/current/morphlines/morphlines-reference-guide.html#logTrace
This will also print which command failed where.
BTW, questions specific to Cloudera Search are best directed to search-user@cloudera.org via http://groups.google.com/a/cloudera.org/group/search-user
Wolfgang
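The swap could look roughly like the sketch below. The schema path is a made-up placeholder; the assumption is that bare Avro records need the writer schema supplied explicitly, whereas container files carry their own:

```hocon
commands : [
  # For Avro records that are NOT wrapped in an Avro container file.
  {
    readAvro {
      writerSchemaFile : /path/to/record.avsc   # hypothetical path to your schema
    }
  }

  # For Avro container files you would use this instead:
  # { readAvroContainer {} }

  # Prints each record when the TRACE log level is enabled.
  { logTrace { format : "record after Avro parsing: {}", args : ["@{}"] } }
]
```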
05-20-2015
06:17 AM
The MorphlineInterceptor expects a byte[] or java.io.InputStream in the _attachment_body field of the morphline output record. This will become the body of the Flume output event. In your case the _attachment_body field instead contains a Jackson JsonNode object, hence the complaint.
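One possible way to get back to a byte[] is sketched below. It assumes the toByteArray command converts the field's values via their string form and the given charset, and that the JsonNode's string form is the JSON text you want in the event body; please verify both against the reference guide for your Kite version:

```hocon
commands : [
  # ... commands that leave a Jackson JsonNode in _attachment_body ...

  # Assumption: converts the field's values to byte[] using toString() plus the charset.
  { toByteArray { field : _attachment_body, charset : UTF-8 } }
]
```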
03-30-2015
06:44 AM
The toAvro command expects a java.util.Map as input when converting to a nested Avro record, per https://github.com/kite-sdk/kite/blob/master/kite-morphlines/kite-morphlines-avro/src/main/java/org/kitesdk/morphline/avro/AvroConversions.java#L73-L87
However, your input data contains a (nested) Jackson JSON object, not a java.util.Map, hence the conversion can't succeed.
Consider writing a custom morphline command that implements whatever conversion rules you wish, per http://kitesdk.org/docs/current/morphlines/morphlines-reference-guide.html#Implementing_your_own_Custom_Command
03-25-2015
01:16 PM
An empty morphline record field can't be converted to that avro schema, of course. Make sure your input data always matches the avro schema.
03-25-2015
09:41 AM
> java.lang.NoSuchMethodError: org.apache.avro.reflect.ReflectData.getDefaultValue(Lorg/apache/avro/Schema$Field;)Ljava/lang/Object;
This means you have the wrong version of the Avro jar file on the classpath.
12-16-2014
11:27 AM
Check the log files of the MapReduce job and the Solr server. The issue is probably that you are missing a sanitizeUnknownSolrFields command in your morphline.
12-09-2014
10:06 AM
1 Kudo
FWIW, also see https://github.com/typesafehub/config/blob/master/HOCON.md#includes
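As a small illustration of the include mechanism described there, a morphline file can pull shared settings from another file. The file name below and the idea that it defines SOLR_LOCATOR are assumptions made up for this sketch:

```hocon
# morphline.conf
include "solr-locators.conf"   # hypothetical file that defines SOLR_LOCATOR

morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**", "org.apache.solr.**"]
    commands : [
      # SOLR_LOCATOR is resolved from the included file.
      { loadSolr { solrLocator : ${SOLR_LOCATOR} } }
    ]
  }
]
```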
12-09-2014
09:55 AM
1 Kudo
Arrange it such that morphlineA pipes records into morphlineB:
Command morphlineB = new Compiler().compile(morphlineFileB, morphlineIdB, morphlineContextB, null);
Command morphlineA = new Compiler().compile(morphlineFileA, morphlineIdA, morphlineContextA, morphlineB);
11-27-2014
02:06 AM
FYI, the tryRules command with the catchExceptions : true parameter handles this kind of scenario more easily. http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#/tryRules
11-25-2014
08:26 AM
That's the expected behavior per the doc at kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#xquery: "The XPath string value of the attribute or child is filled into the record field". The "XPath string value" is the concatenation of the *text nodes*; it does not include the element names or attribute names, per www.w3.org/TR/xquery-operators/#func-string
Also, Solr/Lucene wouldn't know what to do with those tag names anyway. A Lucene/Solr field holds a primitive type such as a string; it doesn't work with nested structures.
P.S. If you really need this, I believe Saxon (and hence the xquery command) has an extension function to emit the serialization of an XML document into a string, but unfortunately that extension function is probably not available in the free "Saxon-HE" version that we ship with kite-morphlines-saxon: www.saxonica.com/documentation9.4-demo/html/extensions/functions/serialize.html
Alternatively, you could write your own custom morphline command that implements whatever xquery serialization logic you like, of course. The code would be a copy and paste of the existing xquery command except for adjusting this part: github.com/kite-sdk/kite/blob/master/kite-morphlines/kite-morphlines-saxon/src/main/java/org/kitesdk/morphline/saxon/XQueryBuilder.java#L196-L198
11-18-2014
01:41 AM
1 Kudo
You need to change your xquery command to wrap your XML output into yet another XML element (e.g. "record"). For example, in order to generate a morphline record with a "myFoo" field that contains "foo", as well as a "myBar" field that contains "bar", your xquery command should be formulated such that it outputs an XML fragment like this:
<record>
  <myFoo>foo</myFoo>
  <myBar>bar</myBar>
</record>
11-10-2014
07:10 AM
The "if" command, the "equals" command, and indeed all morphline commands know nothing about HBase columns or HBase qualifiers, except for the extractHBaseCells command. Use extractHBaseCells to extract whatever HBase columns you want into whatever morphline record fields you want, then use "if", "equals" or similar to act on the morphline record fields (not on HBase columns or qualifiers directly).
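A sketch of that two-step pattern follows. The column family, qualifier, field name, and the value being tested are all invented for illustration:

```hocon
commands : [
  # Step 1: copy an HBase cell into an ordinary morphline record field.
  {
    extractHBaseCells {
      mappings : [
        {
          inputColumn : "info:status"   # hypothetical family:qualifier
          outputField : "status"        # hypothetical record field
          type : string
          source : value
        }
      ]
    }
  }

  # Step 2: test the record field, not the HBase column.
  {
    if {
      conditions : [
        { equals { status : [deleted] } }
      ]
      then : [
        { dropRecord {} }
      ]
      else : [
        { logDebug { format : "keeping record: {}", args : ["@{}"] } }
      ]
    }
  }
]
```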
11-10-2014
03:32 AM
Try equals { id : [] }, for example as shown here: http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#if
In a morphline record there is no difference between a field with zero values and a field that doesn't exist.
11-09-2014
12:35 PM
You can express it all in a single morphline. Consider using the if (if-then-else) command, the tryRules command, or similar to check which case applies and execute whatever logic is appropriate for that case. You can have multiple extractHBaseCells commands in a single morphline, e.g. one in each branch of the tryRules command. Wolfgang.
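One way this could be laid out is sketched below, with a discriminator column extracted first and one tryRules branch per case. All column names, field names, and values are hypothetical; a branch is abandoned as soon as one of its commands fails, and the next rule is tried:

```hocon
commands : [
  # Extract a discriminator cell so the branches have something to test.
  {
    extractHBaseCells {
      mappings : [
        { inputColumn : "meta:kind", outputField : "kind", type : string, source : value }
      ]
    }
  }
  {
    tryRules {
      throwExceptionIfAllRulesFailed : true
      rules : [
        {
          # Case 1: hypothetical "user" rows.
          commands : [
            { equals { kind : [user] } }
            {
              extractHBaseCells {
                mappings : [
                  { inputColumn : "user:name", outputField : "name", type : string, source : value }
                ]
              }
            }
          ]
        }
        {
          # Case 2: hypothetical "order" rows.
          commands : [
            { equals { kind : [order] } }
            {
              extractHBaseCells {
                mappings : [
                  { inputColumn : "order:total", outputField : "total", type : string, source : value }
                ]
              }
            }
          ]
        }
      ]
    }
  }
]
```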
11-06-2014
03:39 AM
It’s mentioned in the ref guide for the next upcoming kite version per https://github.com/kite-sdk/kite/blob/master/kite-morphlines/src/site/confluence/morphlinesReferenceGuide.confluence#L2879-L2889
11-05-2014
10:42 AM
The xquery command expects a byte[] rather than a string as input, and that input must be in the "_attachment_body" field rather than the "data" field. Try changing the extractHBaseCells command to use type : "byte[]" and outputField : "_attachment_body"
Also, you need to change your xquery command to wrap your XML output into yet another XML element (e.g. "record"). For example, in order to generate a morphline record with a "myFoo" field that contains "foo", as well as a "myBar" field that contains "bar", your xquery command should be formulated such that it outputs an XML fragment like this:
<record>
  <myFoo>foo</myFoo>
  <myBar>bar</myBar>
</record>
Wolfgang.
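Put together, the two changes might look like the following sketch. The HBase column name and the XML element names inside the query (doc, foo, bar) are assumptions about the input, chosen only to show the shape:

```hocon
commands : [
  # Hand the raw cell bytes to the next command via _attachment_body.
  {
    extractHBaseCells {
      mappings : [
        {
          inputColumn : "data:xml"            # hypothetical family:qualifier
          outputField : "_attachment_body"
          type : "byte[]"
          source : value
        }
      ]
    }
  }

  # Each <record> element returned by the query becomes one morphline record;
  # its child elements become the record's fields.
  {
    xquery {
      fragments : [
        {
          fragmentPath : "/"
          queryString : """
            for $doc in /doc
            return
              <record>
                <myFoo>{string($doc/foo)}</myFoo>
                <myBar>{string($doc/bar)}</myBar>
              </record>
          """
        }
      ]
    }
  }
]
```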
11-04-2014
12:33 PM
Try /var/log/solr
11-04-2014
12:49 AM
The Solr schema.xml config file needs to conform to the documents that you are trying to insert. Try adjusting schema.xml accordingly and tell Solr about it via the solrctl CLI.
Also see http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#sanitizeUnknownSolrFields
XPath and XQuery docs are here: http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#xquery
The log files of the Solr server, the MapReduce tasks, etc. can be displayed in the Cloudera Manager GUI.
Wolfgang.