Morphline config is partially executed without any error and a command in the pipe is lost
Created ‎10-14-2014 02:45 PM
Hi, I've run into something strange. Here is the good config:
# Copyright 2013 Cloudera Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

morphlines : [
  {
    id : morphline1
    importCommands : ["com.cloudera.**"]
    commands : [
      {
        xquery {
          fragments : [
            {
              fragmentPath : "/"
              queryString : "/tweets/tweet/@text" # each item in result sequence becomes a morphline record
            }
          ]
        }
      }
      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
Here is its partial output:
1008 [main] TRACE com.cloudera.cdk.morphline.saxon.XQueryBuilder$XQuery - XQuery result sequence item #1 is of class: net.sf.saxon.tree.tiny.TinyAttributeImpl with value: text="sample tweet one"
1015 [main] TRACE com.cloudera.cdk.morphline.stdlib.LogDebugBuilder$LogDebug - beforeProcess: {id=[123], text=[sample tweet one]}
1015 [main] DEBUG com.cloudera.cdk.morphline.stdlib.LogDebugBuilder$LogDebug - output record: [{id=[123], text=[sample tweet one]}]
38023 [main] TRACE com.cloudera.cdk.morphline.saxon.XQueryBuilder$XQuery - XQuery result sequence item #2 is of class: net.sf.saxon.tree.tiny.TinyAttributeImpl with value: text="sample tweet two"
38025 [main] TRACE com.cloudera.cdk.morphline.stdlib.LogDebugBuilder$LogDebug - beforeProcess: {id=[123], text=[sample tweet two]}
38025 [main] DEBUG com.cloudera.cdk.morphline.stdlib.LogDebugBuilder$LogDebug - output record: [{id=[123], text=[sample tweet two]}]
Here is the "bad" config; only the xquery command is executed:
morphlines : [
  {
    id : morphline1
    importCommands : ["com.cloudera.**"]
    commands : [
      {
        xquery {
          fragments : [
            {
              fragmentPath : "/"
              queryString : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId/text()"
            }
          ]
        }
      }
      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
I can clearly see in the logs that the first config runs without any problems; it is taken from SaxonMorphlineTest.
The second config also executes without any errors, but logDebug produces no output.
Here is the output:
431 [main] TRACE com.cloudera.cdk.morphline.saxon.XQueryBuilder$XQuery - XQuery result sequence item #1 is of class: net.sf.saxon.tree.tiny.TinyTextImpl with value:someValueIWantToGet
collector.getRecords()[]
Here is what I see in IDEA.
Why does IDEA highlight "xquery" in the good config? I'm on Linux, and I don't see any "bad chars" in a text editor.
Created ‎10-15-2014 12:35 AM
"foo"
which isn't meaningful for a morphline (e.g. into which morphline field should the "foo" string be stored, and what if you want to return values for multiple fields or multiple records?). Thus, in your case the xquery command returns false and logDebug is never executed, hence you see no logDebug output.
Note that the output of an XQuery must be shaped to conform to the format described here: http://kitesdk.org/docs/current/kite-morphlines/morphlinesReferenceGuide.html#/xquery
Example XQuery output:
<myFoo>foo</myFoo>
<myBar>bar</myBar>
This will generate a morphline record with a myFoo field that contains "foo", as well as a myBar field that contains "bar".
Wolfgang.
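The "returns false, so logDebug never runs" behaviour can be pictured with a small model of a command chain. This is only a sketch in Python, not the Kite/CDK API: the names make_chain, extract and log_debug are hypothetical, and the real xquery command is far more involved. It assumes each command receives a record plus the next command, and that a command which cannot produce a record returns False without calling the next one.

```python
def make_chain(commands):
    """Link commands so that each one decides whether to call the next."""
    def terminal(record):
        return True  # end of the chain: nothing left to do

    chain = terminal
    for command in reversed(commands):
        # Bind the current command and its child into a new callable.
        chain = (lambda cmd, child: lambda record: cmd(record, child))(command, chain)
    return chain


log = []  # collects what the hypothetical log_debug command would print


def extract(record, child):
    """Hypothetical stand-in for xquery: needs a name/value shaped item."""
    item = record.get("item")
    if not isinstance(item, dict):
        return False  # bare string: no record built, child never called
    return child(item)


def log_debug(record, child):
    """Hypothetical stand-in for logDebug."""
    log.append("output record: %r" % (record,))
    return child(record)


chain = make_chain([extract, log_debug])
ok_good = chain({"item": {"text": ["sample tweet one"]}})
ok_bad = chain({"item": "someValueIWantToGet"})
print(ok_good, ok_bad, log)
```

In this model the second call fails quietly: no exception is raised, the chain simply reports False and the logging step is skipped, which matches the symptom in the question.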
Created on ‎10-15-2014 08:18 AM - edited ‎10-15-2014 08:43 AM
Hi, it looks like there is a bug: your reply created a new, separate forum thread.
I see the difference.
The example from the CDK tests returns: text="sample tweet one"
sequence item #1 is of class: net.sf.saxon.tree.tiny.TinyAttributeImpl with value: text="sample tweet one"
And mine doesn't return someName=someValue, only someValue:
431 [main] TRACE com.cloudera.cdk.morphline.saxon.XQueryBuilder$XQuery - XQuery result sequence item #1 is of class: net.sf.saxon.tree.tiny.TinyTextImpl with value:someValueIWantToGet
Does the Twitter-based example take the name "text" because that is the attribute name in the XML?
I've refactored my XQuery:
queryString : """
  for $entry in /collectorEvent/attributes/etpEventCollectorAttributes
  return
  <entry>
    {$entry/ssoId}
    {$entry/applicationId}
  </entry>
"""
Now it returns:
{applicationId=[123], ssoId=[someSSO_id]}
Thanks, it works!
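For anyone hitting the same issue, the difference between the two queries can be illustrated outside of morphlines. The sketch below uses Python's xml.etree.ElementTree purely as an analogy; it is not how Kite builds records. The sample XML is an assumption reconstructed from the paths and output quoted in this thread, and to_record is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Assumed input shape, inferred from the query paths in this thread.
SAMPLE = """
<collectorEvent>
  <attributes>
    <etpEventCollectorAttributes>
      <ssoId>someSSO_id</ssoId>
      <applicationId>123</applicationId>
    </etpEventCollectorAttributes>
  </attributes>
</collectorEvent>
"""


def to_record(item):
    """Hypothetical record builder: needs names as well as values."""
    if isinstance(item, str):
        return None  # a bare string has no field name to store it under
    record = {}
    for name, value in item.attrib.items():
        record.setdefault(name, []).append(value)
    for child in item:
        record.setdefault(child.tag, []).append((child.text or "").strip())
    return record


root = ET.fromstring(SAMPLE)
attrs = root.find("attributes/etpEventCollectorAttributes")

# Analogous to .../ssoId/text(): only the value survives, the name is gone.
bare_text = attrs.find("ssoId").text

# Analogous to <entry>{$entry/ssoId}{$entry/applicationId}</entry>.
wrapped = ET.Element("entry")
wrapped.append(attrs.find("ssoId"))
wrapped.append(attrs.find("applicationId"))

print(to_record(bare_text))
print(to_record(wrapped))
```

The wrapped element keeps both the field names and the values, which is why the refactored query produces a usable record while the text() query does not.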
