Storm and Kafka Development Cycle - Best Practices

Expert Contributor

I wanted to ask what process Storm and Kafka developers and testers follow when developing new topologies. The cycle of building a Storm/Kafka topology on a laptop, copying files to a cluster or sandbox, and testing can be time consuming and makes it hard to iterate quickly. Do you have any tips for an efficient development process? For example:

  1. Do you develop on the same machine (Linux or Mac) that is also running your Storm/Kafka cluster?
  2. Or do you rely on JUnit tests to instantiate and test Storm bolts on their own, manually pushing Tuples through the execute() method (example here)?

Thanks for any tips you can pass along to help speed up our development.

1 ACCEPTED SOLUTION

Super Collaborator

I have built and used some testing utilities for creating unit tests and integration tests of topologies that use Kafka, Solr, and other components. They also provide the capability to deploy a topology from your desktop machine to your dev cluster, etc.

Check out the following github links for examples:

  1. Unit Test Example
  2. Integration Test Example

The test utilities it uses can be found here:

Also check out the Hortonworks Gallery; it features a cool project called Mini Clusters by @skumpf@hortonworks.com that has some great testing utilities for running clusters in local mode.


7 REPLIES


It's fairly easy to set up Kafka on a local machine.

Download Kafka from https://kafka.apache.org/downloads.html.

Unzip kafka_2.10-0.8.2.0.tgz.

Run ./bin/zookeeper-server-start.sh config/zookeeper.properties

Run ./bin/kafka-server-start.sh config/server.properties

Once you have Kafka up and running, you can use Storm's LocalCluster to deploy and test your topology. More details on LocalCluster: https://storm.apache.org/documentation/Local-mode.html
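
For reference, a minimal local-mode sketch (assuming a pre-1.0 Storm release where classes live under backtype.storm and the storm-kafka module is on the classpath; in Storm 1.x+ the same classes sit under org.apache.storm, and the topic name and PrinterBolt below are made up):

import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.spout.SchemeAsMultiScheme;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.utils.Utils;
import storm.kafka.KafkaSpout;
import storm.kafka.SpoutConfig;
import storm.kafka.StringScheme;
import storm.kafka.ZkHosts;

public class LocalTopologyRunner {
    public static void main(String[] args) {
        // Point the Kafka spout at the locally running broker via the local ZooKeeper.
        SpoutConfig spoutConfig = new SpoutConfig(new ZkHosts("localhost:2181"),
                "test-topic", "/kafka-spout", "local-dev");
        spoutConfig.scheme = new SchemeAsMultiScheme(new StringScheme());

        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("kafka-spout", new KafkaSpout(spoutConfig));
        builder.setBolt("printer", new PrinterBolt()).shuffleGrouping("kafka-spout"); // PrinterBolt is a stand-in for your own bolt

        Config conf = new Config();
        conf.setDebug(true);

        // Everything runs in this JVM -- no jar copying, no cluster round trips.
        LocalCluster cluster = new LocalCluster();
        cluster.submitTopology("local-dev-topology", conf, builder.createTopology());
        Utils.sleep(30000);                        // let it process for 30 seconds
        cluster.killTopology("local-dev-topology");
        cluster.shutdown();
    }
}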


Doesn't a local Storm instance work for them? E.g., most of the online examples check whether the topology is being launched in the same JVM (local mode) or on a distributed cluster.

One of my customers built a web utility to remotely submit a topology jar to the cluster. I'd imagine it's easy to build with any REST framework.

This might be a good idea to suggest.
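
The check mentioned above usually looks something like this (a sketch, again assuming pre-1.0 backtype.storm packages and a hypothetical buildTopology() that wires up your spouts and bolts):

import backtype.storm.Config;
import backtype.storm.LocalCluster;
import backtype.storm.StormSubmitter;
import backtype.storm.generated.StormTopology;
import backtype.storm.testing.TestWordSpout;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.utils.Utils;

public class TopologyMain {
    public static void main(String[] args) throws Exception {
        Config conf = new Config();

        if (args != null && args.length > 0) {
            // Launched with a name argument (e.g. via `storm jar`): submit to the real cluster.
            conf.setNumWorkers(2);
            StormSubmitter.submitTopology(args[0], conf, buildTopology());
        } else {
            // No arguments: run in-process for quick local iteration.
            LocalCluster cluster = new LocalCluster();
            cluster.submitTopology("dev-topology", conf, buildTopology());
            Utils.sleep(60000);
            cluster.killTopology("dev-topology");
            cluster.shutdown();
        }
    }

    // Hypothetical helper -- replace with your real spouts and bolts.
    private static StormTopology buildTopology() {
        TopologyBuilder builder = new TopologyBuilder();
        builder.setSpout("words", new TestWordSpout(true));
        return builder.createTopology();
    }
}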

Super Collaborator

I have built and used some testing utilities for creating unit tests and integration tests of topologies that use Kafka, Solr, and other components. They also provide the capability to deploy a topology from your desktop machine to your dev cluster, etc.

Check out the following github links for examples:

  1. Unit Test Example
  2. Integration Test Example

The test utilities it uses can be found here:

Also check out the Hortonworks Gallery; it features a cool project called Mini Clusters by @skumpf@hortonworks.com that has some great testing utilities for running clusters in local mode.
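
For anyone who just wants the bare-bones version, the stock Storm testing API that utilities like these build on looks roughly like the sketch below (this is not the linked code; it uses Storm's bundled test spout/bolt and pre-1.0 backtype.storm packages):

import java.util.Map;

import org.junit.Assert;
import org.junit.Test;

import backtype.storm.Config;
import backtype.storm.ILocalCluster;
import backtype.storm.Testing;
import backtype.storm.testing.CompleteTopologyParam;
import backtype.storm.testing.MockedSources;
import backtype.storm.testing.TestGlobalCount;
import backtype.storm.testing.TestJob;
import backtype.storm.testing.TestWordSpout;
import backtype.storm.topology.TopologyBuilder;
import backtype.storm.tuple.Values;

public class WordCountTopologyIT {

    @Test
    public void countsAllMockedTuples() {
        Testing.withSimulatedTimeLocalCluster(new TestJob() {
            @Override
            public void run(ILocalCluster cluster) throws Exception {
                // A tiny topology built from Storm's bundled test components.
                TopologyBuilder builder = new TopologyBuilder();
                builder.setSpout("words", new TestWordSpout(true));
                builder.setBolt("count", new TestGlobalCount()).globalGrouping("words");

                // Replace the spout's output with fixed test data.
                MockedSources mocked = new MockedSources();
                mocked.addMockData("words", new Values("nathan"), new Values("bob"), new Values("joey"));

                CompleteTopologyParam params = new CompleteTopologyParam();
                params.setMockedSources(mocked);
                params.setStormConf(new Config());

                // Runs in-process until the mocked data is fully processed,
                // then hands back every tuple each component emitted.
                Map result = Testing.completeTopology(cluster, builder.createTopology(), params);

                // TestGlobalCount emits a running count: 1, 2, 3 for three input tuples.
                Assert.assertTrue(Testing.multiseteq(
                        new Values(new Values(1), new Values(2), new Values(3)),
                        Testing.readTuples(result, "count")));
            }
        });
    }
}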

Super Collaborator

I plan to move these from my GitHub to the Gallery.

Rising Star

I have had success setting up an Eclipse project for Storm/Kafka and using the tools there to semi-automate the build and deployment. Has anyone else tried that?

New Contributor

I found that using some simple SSH scripts made the cycle quicker for me:

# Build the topology jar locally
mvn package

# Kill the currently running topology on the cluster
# (storm kill waits the topology's message timeout by default; use -w <secs> to shorten)
ssh root@<storm server> <<'ENDSSH'
storm kill <current topology name>
ENDSSH

# Copy the new jar and its config to the Storm server
scp target/<topology>.jar root@<storm server>:
scp src/main/resources/config.properties root@<storm server>:

# Submit the new version
ssh root@<storm server> <<'ENDSSH'
storm jar <topology>.jar <main class> config.properties
ENDSSH

Master Mentor

Using the LocalCluster utility from Storm works pretty well. I also used examples from https://github.com/nathanmarz/storm-starter/blob/master/test/jvm/storm/starter/bolt/RollingCountBolt... for my own unit tests. MockTuple is the way to go.
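
A bare-bones version of that kind of bolt test with Mockito might look like this (WordCountBolt and its output fields are hypothetical; packages are pre-1.0 backtype.storm):

import static org.mockito.Mockito.*;

import org.junit.Test;

import backtype.storm.Config;
import backtype.storm.task.OutputCollector;
import backtype.storm.task.TopologyContext;
import backtype.storm.tuple.Tuple;
import backtype.storm.tuple.Values;

public class WordCountBoltTest {

    @Test
    public void emitsCountAndAcksInput() {
        // Mock the collector and the incoming tuple -- no cluster or ZooKeeper needed.
        OutputCollector collector = mock(OutputCollector.class);
        Tuple input = mock(Tuple.class);
        when(input.getStringByField("word")).thenReturn("storm");

        WordCountBolt bolt = new WordCountBolt();   // hypothetical bolt under test
        bolt.prepare(new Config(), mock(TopologyContext.class), collector);

        bolt.execute(input);

        // Assuming the bolt emits (word, count) and acks each input tuple.
        verify(collector).emit(new Values("storm", 1L));
        verify(collector).ack(input);
    }
}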