Storm and Kafka Development Cycle - Best Practices

Contributor

I wanted to ask what process our Storm and Kafka developers and testers follow when they're developing new topologies. Building a Storm/Kafka topology on a laptop, copying files to a cluster or sandbox, and testing there is time-consuming and makes it hard to iterate quickly. Do you have any tips for a more efficient development process?

  1. Do you develop on the same machine (Linux or Mac) that is also running your Storm/Kafka cluster?
  2. Do you instead rely on JUnit tests that instantiate Storm bolts on their own and manually pass Tuples through the execute() method (example here)?

Thanks for any tips you can pass along to help speed up our development.

Accepted Solutions

Re: Storm and Kafka Development Cycle - Best Practices

Rising Star

I have built and used some testing utilities for creating unit tests and integration tests of topologies that use Kafka, Solr, and other components. Among other things, they let you deploy a topology from your desktop machine to your dev cluster.

Check out the following GitHub links for examples:

  1. Unit Test Example
  2. Integration Test Example

The test utilities it uses can be found here:

Also check out the Hortonworks Gallery. It features a project called Mini Clusters by @skumpf@hortonworks.com that has some great testing utilities for running clusters in local mode.

Re: Storm and Kafka Development Cycle - Best Practices

It's fairly easy to set up Kafka on a local machine.

Download Kafka from https://kafka.apache.org/downloads.html.

Unpack kafka_2.10-0.8.2.0.tgz (for example with tar -xzf).

From the unpacked directory, run ./bin/zookeeper-server-start.sh config/zookeeper.properties

In a second terminal, run ./bin/kafka-server-start.sh config/server.properties

Once you have Kafka up and running, you can use Storm's LocalCluster to deploy and test your topology. More details on LocalCluster: https://storm.apache.org/documentation/Local-mode.html
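Before pointing a LocalCluster topology at the local broker, it usually helps to create a topic for it to read from. The following is a minimal sketch, assuming the kafka-topics.sh script that ships with this Kafka release and ZooKeeper on its default port 2181. The KAFKA_HOME and ZK_ADDR variables, the topic name storm-test, and the run helper are illustrative names, not part of the steps above.

```shell
#!/usr/bin/env bash
# Sketch only: KAFKA_HOME, ZK_ADDR, the topic name and the run() helper are
# assumptions for illustration.
KAFKA_HOME="${KAFKA_HOME:-./kafka_2.10-0.8.2.0}"
ZK_ADDR="${ZK_ADDR:-localhost:2181}"

# Print the command instead of executing it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Create a single-partition, unreplicated topic on the local broker.
create_topic() {
  run "$KAFKA_HOME/bin/kafka-topics.sh" --create --zookeeper "$ZK_ADDR" \
    --replication-factor 1 --partitions 1 --topic "$1"
}

# List topics to confirm the broker registered the new one.
list_topics() {
  run "$KAFKA_HOME/bin/kafka-topics.sh" --list --zookeeper "$ZK_ADDR"
}

# Usage (with ZooKeeper and Kafka already running):
#   create_topic storm-test && list_topics
```

Setting DRY_RUN=1 prints the commands without running them, which is a cheap way to check paths before touching the broker.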

Re: Storm and Kafka Development Cycle - Best Practices

Doesn't a local Storm instance work for them? Most online examples check whether the topology is being launched in the same JVM or on a distributed cluster.

One of my customers built a web utility for remotely submitting a topology jar to the cluster. I'd imagine it's easy to build with any REST framework.

This might be a good idea to suggest.

Re: Storm and Kafka Development Cycle - Best Practices

Rising Star

I plan to move these from my GitHub to the Gallery.

Re: Storm and Kafka Development Cycle - Best Practices

Contributor

I have had success setting up an Eclipse project for Storm/Kafka and using the tools there to semi-automate the build and deployment. Has anyone else tried that?

Re: Storm and Kafka Development Cycle - Best Practices

New Contributor

I found that a few simple ssh scripts made it quicker for me:

# Build the topology jar
mvn package

# Kill the currently running topology
ssh root@<storm server> <<'ENDSSH'
storm kill <current topology name>
ENDSSH

# Copy the jar and its config to the Storm server
scp target/<topology>.jar root@<storm server>:
scp src/main/resources/config.properties root@<storm server>:

# Submit the new build
ssh root@<storm server> <<'ENDSSH'
storm jar <topology>.jar <main class> config.properties
ENDSSH
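The steps above can also be wrapped in one function so the whole cycle is a single command. This is only a sketch: the redeploy and run helpers and all argument values are hypothetical, and it assumes key-based ssh access as root. It uses the -w option of storm kill to set the wait time to zero seconds.

```shell
#!/usr/bin/env bash
# Sketch only: the redeploy/run helpers and all argument values are hypothetical.

# Print the command instead of executing it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# redeploy <host> <topology name> <jar path> <main class> <config path>
redeploy() {
  local host="$1" topology="$2" jar="$3" main_class="$4" config="$5"

  # Build the jar; stop if the build fails.
  run mvn package || return 1

  # "storm kill" fails when the topology is not running; keep going in that case.
  run ssh "root@$host" "storm kill $topology -w 0" \
    || echo "storm kill failed (topology may not be running)" >&2

  # Copy the jar and config into root's home directory on the server.
  run scp "$jar" "$config" "root@$host:" || return 1

  # Submit the new build, passing the config file name as the program argument.
  run ssh "root@$host" \
    "storm jar $(basename "$jar") $main_class $(basename "$config")"
}

# Usage:
#   redeploy storm-server wordcount target/wc.jar com.example.WordCount \
#     src/main/resources/config.properties
```

Even with a zero wait time the kill is asynchronous, so a short pause before resubmitting under the same topology name may still be needed on some clusters.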

Re: Storm and Kafka Development Cycle - Best Practices

Mentor

Using the LocalCluster utility from Storm works pretty well. I also used examples from https://github.com/nathanmarz/storm-starter/blob/master/test/jvm/storm/starter/bolt/RollingCountBolt... for my own unit tests. MockTuple is the way to go.