Created 10-14-2014 02:45 PM
Hi, I've run into something strange. Here is the good config:
# Copyright 2013 Cloudera Inc.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

morphlines : [
  {
    id : morphline1
    importCommands : ["com.cloudera.**"]

    commands : [
      {
        xquery {
          fragments : [
            {
              fragmentPath : "/"
              queryString : "/tweets/tweet/@text" # each item in result sequence becomes a morphline record
            }
          ]
        }
      }

      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
Here is its partial output:
1008 [main] TRACE com.cloudera.cdk.morphline.saxon.XQueryBuilder$XQuery - XQuery result sequence item #1 is of class: net.sf.saxon.tree.tiny.TinyAttributeImpl with value: text="sample tweet one"
1015 [main] TRACE com.cloudera.cdk.morphline.stdlib.LogDebugBuilder$LogDebug - beforeProcess: {id=[123], text=[sample tweet one]}
1015 [main] DEBUG com.cloudera.cdk.morphline.stdlib.LogDebugBuilder$LogDebug - output record: [{id=[123], text=[sample tweet one]}]
38023 [main] TRACE com.cloudera.cdk.morphline.saxon.XQueryBuilder$XQuery - XQuery result sequence item #2 is of class: net.sf.saxon.tree.tiny.TinyAttributeImpl with value: text="sample tweet two"
38025 [main] TRACE com.cloudera.cdk.morphline.stdlib.LogDebugBuilder$LogDebug - beforeProcess: {id=[123], text=[sample tweet two]}
38025 [main] DEBUG com.cloudera.cdk.morphline.stdlib.LogDebugBuilder$LogDebug - output record: [{id=[123], text=[sample tweet two]}]
Here is the "bad" config, where only the xquery command is executed:
morphlines : [
  {
    id : morphline1
    importCommands : ["com.cloudera.**"]

    commands : [
      {
        xquery {
          fragments : [
            {
              fragmentPath : "/"
              queryString : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId/text()"
            }
          ]
        }
      }

      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
I can clearly see in the logs that the first config runs without any problems; this config is taken from SaxonMorphlineTest.
The second config also executes without errors, but logDebug produces no output.
Here is output:
431 [main] TRACE com.cloudera.cdk.morphline.saxon.XQueryBuilder$XQuery - XQuery result sequence item #1 is of class: net.sf.saxon.tree.tiny.TinyTextImpl with value:someValueIWantToGet
collector.getRecords()[]
Here is what I see in IntelliJ IDEA.
Why does IDEA highlight "xquery" in the good config? I'm on Linux, and I don't see any "bad chars" in a text editor.
Created 10-15-2014 12:35 AM
Created on 10-15-2014 08:18 AM - edited 10-15-2014 08:43 AM
Hi, it looks like there's a bug: your reply created a new, separate forum thread.
I see the difference.
The example from the CDK tests returns: text="sample tweet one"
sequence item #1 is of class: net.sf.saxon.tree.tiny.TinyAttributeImpl with value: text="sample tweet one"
And mine doesn't return someName=someValue, only someValue:
431 [main] TRACE com.cloudera.cdk.morphline.saxon.XQueryBuilder$XQuery - XQuery result sequence item #1 is of class: net.sf.saxon.tree.tiny.TinyTextImpl with value:someValueIWantToGet
Does the Twitter-based example take the name "text" because it's the attribute name from the XML?
I've refactored my XQuery:
queryString : """
  for $entry in /collectorEvent/attributes/etpEventCollectorAttributes
  return
    <entry>
      {$entry/ssoId}
      {$entry/applicationId}
    </entry>
"""
Now it returns:
{applicationId=[123], ssoId=[someSSO_id]}
Thanks, it works!
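For anyone hitting the same thing, here is a rough Python sketch of why the two queries behave differently. It is only a toy model, not the morphline or Saxon code: it assumes the xquery command can only build record fields from items that carry a name (an attribute, or an element with attributes or child elements), which is consistent with the logs above but not confirmed from the source. The function name `named_fields` and the sample XML are made up for illustration.

```python
from xml.dom import Node, minidom


def named_fields(node):
    """Toy model (an assumption, not Kite's code): collect name -> [values]
    pairs that a single XQuery result item could contribute to a record."""
    fields = {}
    if node.nodeType == Node.ATTRIBUTE_NODE:
        # An attribute node carries its own name, e.g. text="sample tweet one".
        fields.setdefault(node.name, []).append(node.value)
    elif node.nodeType == Node.ELEMENT_NODE:
        # An element contributes its attributes and its child elements.
        for name, value in node.attributes.items():
            fields.setdefault(name, []).append(value)
        for child in node.childNodes:
            if child.nodeType == Node.ELEMENT_NODE:
                text = "".join(
                    c.data for c in child.childNodes if c.nodeType == Node.TEXT_NODE
                )
                fields.setdefault(child.tagName, []).append(text)
    # A bare text node has a value but no name, so it contributes nothing.
    return fields


# Hypothetical sample documents, made up for illustration.
tweets = minidom.parseString('<tweets><tweet text="sample tweet one"/></tweets>')
attr_node = tweets.getElementsByTagName("tweet")[0].getAttributeNode("text")

event = minidom.parseString(
    "<entry><ssoId>someSSO_id</ssoId><applicationId>123</applicationId></entry>"
)
entry_node = event.documentElement
text_node = entry_node.getElementsByTagName("ssoId")[0].firstChild

print(named_fields(attr_node))   # attribute node: one named field
print(named_fields(text_node))   # text node: no named fields
print(named_fields(entry_node))  # wrapper element: one field per child
```

Under that assumption, the text() query yields a value with no field name to store it under, while wrapping the elements in an entry element gives each value a name.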