Created on 10-14-2014 09:53 AM - edited 09-16-2022 02:09 AM
Hi, I want to receive an XML payload with Flume and use morphlines to load the parsed data into Solr.
I'm quite confused, though: should I use cdk-morphlines or Kite?
I have a config:
morphlines : [
  {
    id : morphline1
    importCommands : ["com.cloudera.**"]
    commands : [
      {
        xquery {
          fragments : [
            {
              fragmentPath : "/"
              queryString : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId"
            }
          ]
        }
      }
      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
It runs with CDK but doesn't work in a Kite environment. I get an exception when I try to run the test:
org.kitesdk.morphline.api.MorphlineCompilationException: No command builder registered for name: xquery near: {
# target/test-classes/morphlines/dummy-xml.conf: 8
"xquery" : {
# target/test-classes/morphlines/dummy-xml.conf: 9
"fragments" : [
# target/test-classes/morphlines/dummy-xml.conf: 10
{
# target/test-classes/morphlines/dummy-xml.conf: 12
"queryString" : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId",
# target/test-classes/morphlines/dummy-xml.conf: 11
"fragmentPath" : "/"
}
]
}
}
Why? And what would work with Flume?
Created 10-14-2014 10:32 AM
Created 10-14-2014 10:05 AM
Created on 10-14-2014 10:28 AM - edited 10-14-2014 10:29 AM
Hi!
I've worked through this guide:
I spent some time getting it to work from Cloudera Manager, and it really does work.
What confuses me:
Here are the imports from the tutorial:
# Import all morphline commands in these java packages and their subpackages.
# Other commands that may be present on the classpath are not visible to this morphline.
importCommands : ["com.cloudera.**", "org.apache.solr.**"]
Kite has a different configuration in its examples. It looks even more complicated than the CDK example tutorial.
Created 10-14-2014 10:32 AM
Created 10-14-2014 11:09 AM
OK, so if I migrate to CDH5, do I have to refactor my morphlines.conf?
This config works for CDK:
morphlines : [
  {
    id : morphline1
    importCommands : ["com.cloudera.**"]
    commands : [
      {
        xquery {
          fragments : [
            {
              fragmentPath : "/"
              queryString : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId"
            }
          ]
        }
      }
      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
and fails for Kite with this exception:
org.kitesdk.morphline.api.MorphlineCompilationException: No command builder registered for name: xquery near: {
# target/test-classes/morphlines/dummy-xml.conf: 8
"xquery" : {
# target/test-classes/morphlines/dummy-xml.conf: 9
"fragments" : [
# target/test-classes/morphlines/dummy-xml.conf: 10
{
# target/test-classes/morphlines/dummy-xml.conf: 12
"queryString" : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId",
# target/test-classes/morphlines/dummy-xml.conf: 11
"fragmentPath" : "/"
}
]
}
}
here is my kite-based test:
import org.junit.Test
import org.kitesdk.morphline.api.AbstractMorphlineTest
import org.kitesdk.morphline.api.Record
import org.kitesdk.morphline.base.Fields

/**
 * User: sergey.sheypak
 * Date: 13.10.14
 * Time: 20:30
 */
class ParseDummyXmlTest extends AbstractMorphlineTest {

    @Test
    void testParseDummyXml(){
        morphline = createMorphline('morphlines/dummy-xml');
        def record = new Record()
        record.put(Fields.ATTACHMENT_BODY, readDummyXml());
        processAndVerifySuccess(record, null);
    }

    InputStream readDummyXml(){
        this.class.classLoader.getResourceAsStream('dummy.xml')
    }

    private void processAndVerifySuccess(Record input, Record expected) {
        collector.reset();
        startSession();
        morphline.process(input)
        collector.getFirstRecord()
    }
}
my kite dependencies are:
<dependency>
    <groupId>org.kitesdk</groupId>
    <artifactId>kite-morphlines-all</artifactId>
    <version>0.17.0</version>
    <type>pom</type>
</dependency>
<dependency>
    <groupId>org.kitesdk</groupId>
    <artifactId>kite-morphlines-core</artifactId>
    <type>test-jar</type>
    <scope>test</scope>
    <version>0.17.0</version>
</dependency>
<dependency>
    <groupId>org.kitesdk</groupId>
    <artifactId>kite-morphlines-saxon</artifactId>
    <version>0.17.0</version>
</dependency>
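One detail stands out when the config is compared with the exception: the exception class lives in the org.kitesdk package, while importCommands only imports com.cloudera.**. Kite's examples use org.kitesdk.** as the import pattern, so a Kite variant of the same morphline would presumably look like the sketch below. Only the importCommands line differs from the config above. Whether this change alone is enough for this particular setup is an assumption and has not been verified here.

```
morphlines : [
  {
    id : morphline1

    # Assumption: under Kite the command builders (including xquery from
    # kite-morphlines-saxon) are in org.kitesdk.* packages, so the old
    # com.cloudera.** pattern no longer matches any of them.
    importCommands : ["org.kitesdk.**"]

    commands : [
      {
        xquery {
          fragments : [
            {
              fragmentPath : "/"
              queryString : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId"
            }
          ]
        }
      }
      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
```

If the morphline also loads into Solr, the tutorial's second pattern, "org.apache.solr.**", would likely still be needed next to "org.kitesdk.**".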
Created 10-14-2014 11:14 AM
Created 10-14-2014 11:30 AM
Thanks for your patience,