Created on 11-15-2014 07:16 AM - edited 09-16-2022 02:13 AM
Hi, I use a morphline to parse incoming XML and store it in Solr. The problem is that the morphline removes all tags. I need to store a subtree of the incoming XML in Solr.
Example:
<ecol:body>
  <out:StatusMessage xmlns:out="http://lol.ru/coordinate/v5/" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    <out:ResponseDate xsi:nil="true"/>
    <out:PlanDate xsi:nil="true"/>
    <out:StatusCode>1040</out:StatusCode>
    <out:Responsible>
      <out:LastName>XXXX</out:LastName>
      <out:FirstName>YY</out:FirstName>
    </out:Responsible>
    <out:Note/>
    <out:ServiceNumber>123123123</out:ServiceNumber>
  </out:StatusMessage>
</ecol:body>
A part of my morphline config:
return <entry> {$entry/attr:ssoId} {$entry/attr:applicationId} {$entry/../../ecol:body} </entry>
The value for <ecol:body> comes out as: 1040 \n XXXX \n YY 12312312123, and ALL tags are removed. I want to keep the tags. Is there any way to do that?
Created 11-25-2014 08:26 AM
Created 11-18-2014 01:41 AM
You need to change your xquery command to wrap your XML output in yet another XML element (e.g. "record").
For example, in order to generate a morphline record with a "myFoo" field that contains "foo",
as well as a "myBar" field that contains "bar", your xquery command should be formulated such
that it outputs an XML fragment like this:
<record>
<myFoo>foo</myFoo>
<myBar>bar</myBar>
</record>
Created 11-18-2014 01:46 AM
Cool, thanks! I'll try this evening.
Created 11-24-2014 02:14 PM
Hi, I've tried this:
return <entry> {$entry/attr:ssoId} {$entry/attr:applicationId} <body>{$entry/../../ecol:body}</body> </entry>
And that:
return <record> <entry> {$entry/attr:ssoId} {$entry/attr:applicationId} <body>{$entry/../../ecol:body}</body> </entry> </record>
Nothing helps. It looks like I don't understand the idea or how it works.
<body>{$entry/../../ecol:body}</body>
is extracted, but ALL tags under <ecol:body/> are still removed.
What am I doing wrong?
Created 11-25-2014 06:15 AM
The result is the same: I do get the contents of the expression {$entry/../../body},
but all tags are removed.
Imagine I had
<body>
<inner>inner text </inner>
</body>
I get "inner text" as the result.
I want to get <inner>inner text </inner>, with the tag kept.
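To make the difference concrete, here is a small sketch in Python (not morphline or XQuery code, just an illustration using the standard xml.etree.ElementTree module). It assumes the field value is being built from the element's text content, which matches the behaviour described above; the variable names are made up for the example.

```python
import xml.etree.ElementTree as ET

body = ET.fromstring("<body><inner>inner text </inner></body>")

# Text content: the concatenated text of all descendants, markup dropped.
# This corresponds to what currently ends up in the field.
text_only = "".join(body.itertext())

# Serialized form: each child element written back out as markup.
# This corresponds to what I would like to store instead.
with_tags = "".join(ET.tostring(child, encoding="unicode") for child in body)

print(repr(text_only))
print(repr(with_tags))
```

So the subtree has to be turned into a string with its markup before it becomes a field value; otherwise only the text nodes survive.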
Created 11-27-2014 02:09 AM
I took
org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder
as an example.
My custom interceptor takes the event body and stores it in an event header.
SolrSink then picks up this header by default and sends it to Solr for indexing.
It works.
NB: the Solr schema.xml must have a matching field declaration.
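For anyone who wants the idea without the Flume plumbing, the logic can be sketched roughly like this. This is a language-neutral illustration in Python, not the actual interceptor (which would be a Java class implementing Flume's Interceptor interface). The Event class is a hypothetical stand-in for a Flume event, and the header name "body_xml" is an assumption; use whatever field name your schema.xml declares.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical stand-in for a Flume event: raw body bytes plus string headers."""
    body: bytes
    headers: dict = field(default_factory=dict)


def intercept(event: Event, header_name: str = "body_xml") -> Event:
    """Copy the raw event body, markup included, into a header."""
    # The header name is an assumption; it should match a field in schema.xml.
    event.headers[header_name] = event.body.decode("utf-8")
    return event


event = intercept(Event(body=b"<body><inner>inner text </inner></body>"))
print(event.headers["body_xml"])
```

Because the body is copied as a plain string before any XML processing happens, the tags are still there when the header value reaches Solr.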