Member since 07-29-2013
162 Posts, 8 Kudos Received, 7 Solutions
10-14-2014
11:09 AM
Ok, so if I migrate to CDH5, do I have to refactor my morphlines.conf? This config works for cdk:
morphlines : [
  {
    id : morphline1
    importCommands : ["com.cloudera.**"]
    commands : [
      {
        xquery {
          fragments : [
            {
              fragmentPath : "/"
              queryString : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId"
            }
          ]
        }
      }
      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
and fails for kite with this exception:
org.kitesdk.morphline.api.MorphlineCompilationException: No command builder registered for name: xquery near: {
    # target/test-classes/morphlines/dummy-xml.conf: 8
    "xquery" : {
        # target/test-classes/morphlines/dummy-xml.conf: 9
        "fragments" : [
            # target/test-classes/morphlines/dummy-xml.conf: 10
            {
                # target/test-classes/morphlines/dummy-xml.conf: 12
                "queryString" : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId",
                # target/test-classes/morphlines/dummy-xml.conf: 11
                "fragmentPath" : "/"
            }
        ]
    }
}
Here is my kite-based test:
import org.junit.Test
import org.kitesdk.morphline.api.AbstractMorphlineTest
import org.kitesdk.morphline.api.Record
import org.kitesdk.morphline.base.Fields

/**
 * User: sergey.sheypak
 * Date: 13.10.14
 * Time: 20:30
 */
class ParseDummyXmlTest extends AbstractMorphlineTest {

    @Test
    void testParseDummyXml() {
        // Loaded from the test resources as morphlines/dummy-xml.conf
        morphline = createMorphline('morphlines/dummy-xml')
        def record = new Record()
        record.put(Fields.ATTACHMENT_BODY, readDummyXml())
        processAndVerifySuccess(record, null)
    }

    // Reads the sample XML payload from the test classpath
    InputStream readDummyXml() {
        this.class.classLoader.getResourceAsStream('dummy.xml')
    }

    // Runs the record through the morphline; 'expected' is not compared yet
    private void processAndVerifySuccess(Record input, Record expected) {
        collector.reset()
        startSession()
        morphline.process(input)
        collector.getFirstRecord()
    }
}
My kite dependencies are:
<dependency>
    <groupId>org.kitesdk</groupId>
    <artifactId>kite-morphlines-all</artifactId>
    <version>0.17.0</version>
    <type>pom</type>
</dependency>
<dependency>
    <groupId>org.kitesdk</groupId>
    <artifactId>kite-morphlines-core</artifactId>
    <type>test-jar</type>
    <scope>test</scope>
    <version>0.17.0</version>
</dependency>
<dependency>
    <groupId>org.kitesdk</groupId>
    <artifactId>kite-morphlines-saxon</artifactId>
    <version>0.17.0</version>
</dependency>
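A possible cause, offered as a hedged sketch rather than a confirmed fix: Kite's command builders live in org.kitesdk packages (the exception itself comes from org.kitesdk.morphline.api), so an importCommands pattern of "com.cloudera.**" would not match any of them, xquery included. Assuming that is the problem, the same morphline with only the import pattern changed would look like this:

```
morphlines : [
  {
    id : morphline1
    # Assumption: the Kite commands are registered under org.kitesdk packages,
    # so the import pattern has to match those instead of com.cloudera.
    importCommands : ["org.kitesdk.**"]
    commands : [
      {
        xquery {
          fragments : [
            {
              fragmentPath : "/"
              queryString : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId"
            }
          ]
        }
      }
      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
```

If the Solr commands are needed as well, the "org.apache.solr.**" pattern can be listed next to it in the same array.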
10-14-2014
10:28 AM
Hi! I've gone through this guide: http://www.cloudera.com/content/cloudera/en/documentation/cloudera-search/v1-latest/Cloudera-Search-User-Guide/csug_deploy_solr_sink_flume_agent.html
I spent some time making it work from Cloudera Manager, and it really works. What confuses me: here are the imports from the tutorial:
# Import all morphline commands in these java packages and their subpackages.
# Other commands that may be present on the classpath are not visible to this morphline.
importCommands : ["com.cloudera.**", "org.apache.solr.**"]
and kite has a different configuration in its examples. It looks even more complicated than the CDK example tutorial.
10-14-2014
09:53 AM
Hi, I want to catch an XML payload using flume and use morphlines to put the parsed data into solr. I'm confused about one thing: which should I use, cdk-morphlines or kite? I have this config:
morphlines : [
  {
    id : morphline1
    importCommands : ["com.cloudera.**"]
    commands : [
      {
        xquery {
          fragments : [
            {
              fragmentPath : "/"
              queryString : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId"
            }
          ]
        }
      }
      { logDebug { format : "output record: {}", args : ["@{}"] } }
    ]
  }
]
It runs with cdk and doesn't work in the kite environment. I get this exception when I try to run the test:
org.kitesdk.morphline.api.MorphlineCompilationException: No command builder registered for name: xquery near: {
    # target/test-classes/morphlines/dummy-xml.conf: 8
    "xquery" : {
        # target/test-classes/morphlines/dummy-xml.conf: 9
        "fragments" : [
            # target/test-classes/morphlines/dummy-xml.conf: 10
            {
                # target/test-classes/morphlines/dummy-xml.conf: 12
                "queryString" : "/collectorEvent/attributes/etpEventCollectorAttributes/ssoId",
                # target/test-classes/morphlines/dummy-xml.conf: 11
                "fragmentPath" : "/"
            }
        ]
    }
}
Why? And what would work with flume?
Labels: Apache Flume, Apache Solr
10-14-2014
09:37 AM
Is there any possibility to contribute to the project? It would be great to decouple the "test basement". I see these major problems:
1. It is tightly coupled with junit.
2. I have to download dozens of deps to make it run.
3. protected static final java.lang.String RESOURCES_DIR = "target/test-classes"; forces me to put configs under test resources. What is the reason to hardcode it?
I get java.io.FileNotFoundException: File not found: target/test-classes/dummy-xml.conf while trying to run my test 😞
After I put the config in the expected place, it just throws an NPE:
java.lang.NullPointerException: null
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:187)
    at org.kitesdk.morphline.base.AbstractCommand.<init>(AbstractCommand.java:71)
    at org.kitesdk.morphline.stdlib.Pipe.<init>(Pipe.java:38)
    at org.kitesdk.morphline.stdlib.PipeBuilder.build(PipeBuilder.java:40)
checkNotNull what? There is not much info to make it work 😞
10-14-2014
01:04 AM
Oh, I used the wrong artifact. Here is the right one, with <type>test-jar</type>. Thanks!
<dependency>
    <groupId>org.kitesdk</groupId>
    <artifactId>kite-morphlines-core</artifactId>
    <type>test-jar</type>
    <scope>test</scope>
    <version>${kite-version}</version>
</dependency>
10-13-2014
01:26 PM
Hi, thanks for the reply. It really looks more like "debug" than "test". I would expect something like:
//groovy-like pseudocode using hamcrest
@Test
void testParseSmthUsingMorphline() {
    def result = doSomeTrickyStuff('a_path_to_morphline_config', 'a_path_to_input_dataset')
    assertThat(result, hasSize(3))
    assertThat(result.get(0).get('myProperty'), equalTo('some cool value'))
}
P.S. Please add code highlighting!
10-13-2014
10:38 AM
Hi, I've seen this: https://github.com/cloudera/cdk/blob/master/cdk-morphlines/cdk-morphlines-core/src/test/java/com/cloudera/cdk/morphline/api/MorphlineDemo.java
and I have no idea how to get access to the parsed records. I've also seen this: https://github.com/cloudera/cdk/blob/master/cdk-morphlines/cdk-morphlines-saxon/src/test/java/com/cloudera/cdk/morphline/saxon/SaxonMorphlineTest.java
and I can't use it, because:
1. It uses junit.
2. I can't get access to com.cloudera.cdk.morphline.api.Collector; I don't see where the artifact with classifier "test" is published.
What is the right approach?
06-09-2014
10:51 PM
2 Kudos
Vikram Srivastava helped me on Google Groups. Here is his explanation: The alternatives priority for HDFS is by default configured lower than MapReduce, so deploying HDFS client configs only will not update what /etc/hadoop/conf points to. I've filed an internal issue for this to warn users that they need to deploy cluster client configs rather than individual services.
Hope it helps other hadoopers 🙂
06-09-2014
04:08 AM
This problem is related only to the HDFS service. I deployed the client configuration of the MapReduce service. That updates the client configs mapred-site.xml and hdfs-site.xml, and I do see the updated hdfs-site.xml. The other problem with the HDFS service is that I can't delete any role (DN, Gateway, JournalNode). Cloudera Manager just starts to consume 100% CPU and jstack reports a thread deadlock...
06-09-2014
03:31 AM
Hi, we have NN HA on quorum journal. One of our namenodes failed recently, and we replaced it with a new one. HDFS works; it's possible to read and write data. When I click 'download client configuration', I see that hdfs-site.xml has the right settings for the NN HA service configuration. When I click 'deploy client configuration', nothing happens. /etc/hadoop/conf/hdfs-site.xml still has the old configuration and references the deleted NN role. Its last modified time has not changed either. It looks like it's not updated by CM... How can we fix it?
Labels: Apache Hadoop, Cloudera Manager, HDFS