Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

morphlines, kite, solr and flume

Expert Contributor

Hi, I want to receive an XML payload with Flume and use morphlines to load the parsed data into Solr.

I'm thoroughly confused about one thing: which should I use, cdk-morphlines or Kite?

I have a config:

 

morphlines : [
  {
    id : morphline1
    importCommands : ["com.cloudera.**"]

    commands : [
      {
        xquery {
          fragments : [
            {
              fragmentPath : "/"
              queryString : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId"
            }
          ]
        }
      }

      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]

It runs with CDK but doesn't work in a Kite environment. I get an exception when I try to run my test:

 

org.kitesdk.morphline.api.MorphlineCompilationException: No command builder registered for name: xquery near: {
# target/test-classes/morphlines/dummy-xml.conf: 8
"xquery" : {
# target/test-classes/morphlines/dummy-xml.conf: 9
"fragments" : [
# target/test-classes/morphlines/dummy-xml.conf: 10
{
# target/test-classes/morphlines/dummy-xml.conf: 12
"queryString" : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId",
# target/test-classes/morphlines/dummy-xml.conf: 11
"fragmentPath" : "/"
}
]
}
}

 

Why does this happen, and what would work with Flume?

1 ACCEPTED SOLUTION

Super Collaborator
CDH 4.x uses CDK, whereas CDH 5.x uses Kite. The only difference is in the package names.


6 REPLIES

Super Collaborator
This information is in the "version 0.10.0" section of http://kitesdk.org/docs/current/release_notes.html

Wolfgang.

Expert Contributor

Hi!

I've worked through this guide:

http://www.cloudera.com/content/cloudera/en/documentation/cloudera-search/v1-latest/Cloudera-Search-...

 

I spent some time getting it to work from Cloudera Manager, and it really does work.

Here is what confuses me.

These are the imports from the tutorial:

    # Import all morphline commands in these java packages and their subpackages.
    # Other commands that may be present on the classpath are not visible to this morphline.
    importCommands : ["com.cloudera.**", "org.apache.solr.**"]

 

Kite uses a different configuration in its examples, and it looks even more complicated than the CDK example tutorial.

Super Collaborator
CDH 4.x uses CDK, whereas CDH 5.x uses Kite. The only difference is in the package names.
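
To make the package-name point concrete, here is a minimal sketch of the morphline from the question adjusted for Kite. It assumes the Kite command builders, including `xquery` from kite-morphlines-saxon, live under the `org.kitesdk` package tree, as the `org.kitesdk.morphline` prefix in the exception suggests. Under that assumption the pattern `com.cloudera.**` matches only CDK classes, so no builder for `xquery` is found. Only the `importCommands` line differs from the posted config.

```
morphlines : [
  {
    id : morphline1

    # Assumption: Kite commands are registered under org.kitesdk,
    # so the CDK pattern "com.cloudera.**" no longer matches any of them.
    importCommands : ["org.kitesdk.**"]

    commands : [
      {
        # Unchanged from the original config; xquery requires
        # kite-morphlines-saxon on the classpath.
        xquery {
          fragments : [
            {
              fragmentPath : "/"
              queryString : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId"
            }
          ]
        }
      }

      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
```

If the morphline also loads into Solr, the tutorial's second pattern, `"org.apache.solr.**"`, would presumably stay in the list alongside the Kite one.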

Expert Contributor

OK, so if I migrate to CDH 5, do I have to refactor my morphlines.conf?

 

This config works with CDK:

 

morphlines : [
  {
    id : morphline1
    importCommands : ["com.cloudera.**"]

    commands : [
      {
        xquery {
          fragments : [
            {
              fragmentPath : "/"
              queryString : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId"
            }
          ]
        }
      }

      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]

and fails with Kite, throwing this exception:

 

org.kitesdk.morphline.api.MorphlineCompilationException: No command builder registered for name: xquery near: {
    # target/test-classes/morphlines/dummy-xml.conf: 8
    "xquery" : {
        # target/test-classes/morphlines/dummy-xml.conf: 9
        "fragments" : [
            # target/test-classes/morphlines/dummy-xml.conf: 10
            {
                # target/test-classes/morphlines/dummy-xml.conf: 12
                "queryString" : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId",
                # target/test-classes/morphlines/dummy-xml.conf: 11
                "fragmentPath" : "/"
            }
        ]
    }
}

Here is my Kite-based test:

 

import org.junit.Test
import org.kitesdk.morphline.api.AbstractMorphlineTest
import org.kitesdk.morphline.api.Record
import org.kitesdk.morphline.base.Fields

/**
 * User: sergey.sheypak
 * Date: 13.10.14
 * Time: 20:30
 */
class ParseDummyXmlTest extends AbstractMorphlineTest {

    @Test
    void testParseDummyXml(){
        morphline = createMorphline('morphlines/dummy-xml');
        def record = new Record()
        record.put(Fields.ATTACHMENT_BODY, readDummyXml());
        processAndVerifySuccess(record, null);
    }


    InputStream readDummyXml(){
        this.class.classLoader.getResourceAsStream('dummy.xml')
    }

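    // Runs the record through the morphline and fails the test if processing is rejected.
    // Note: `expected` is not compared yet; this only checks that a record reached the collector.
    private void processAndVerifySuccess(Record input, Record expected) {
        collector.reset()
        startSession()
        assert morphline.process(input)
        assert collector.getFirstRecord() != null
    }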
}

My Kite dependencies are:

        <dependency>
            <groupId>org.kitesdk</groupId>
            <artifactId>kite-morphlines-all</artifactId>
            <version>0.17.0</version>
            <type>pom</type>
        </dependency>
        <dependency>
            <groupId>org.kitesdk</groupId>
            <artifactId>kite-morphlines-core</artifactId>
            <type>test-jar</type>
            <scope>test</scope>
            <version>0.17.0</version>
        </dependency>
        <dependency>
            <groupId>org.kitesdk</groupId>
            <artifactId>kite-morphlines-saxon</artifactId>
            <version>0.17.0</version>
        </dependency>

 

Super Collaborator

Expert Contributor

Thanks for your patience,