Archives of Support Questions (Read Only)

This is an archived board for historical reference. Information and links may no longer be available or relevant

Can I write a Storm topology with Python?

I know it sounds a bit crazy, but I am a data scientist 🙂 and my preferred language is still Python. I use it with Spark, but I would like to be able to implement some smart models within a Storm topology. We did not adopt Spark Streaming, and Storm still works best with Kafka for us. Any pointers on how to start?

1 ACCEPTED SOLUTION

@Boris Demerov

I guess we are both data scientists.

Storm ships with multilang adapters for Python and Ruby.

The right place to start is src/storm.thrift. Because Storm topologies are just Thrift structures and Nimbus is a Thrift daemon, you can create and submit topologies in any language. Components written in other languages talk to Storm through the Multilang protocol, which has its own specification in the Storm documentation. The Thrift structure lets you define a multilang component explicitly as a program plus a script, e.g., python and the file implementing your bolt. Multilang exchanges JSON messages with the subprocess over stdin/stdout.
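To make the stdin/stdout exchange concrete, here is a minimal sketch of the message framing, assuming the framing described in the Multilang protocol specification: each message is a JSON document followed by a line containing only "end". The helper names read_message and send_message are hypothetical and are not part of Storm; in a real bolt the streams would be sys.stdin and sys.stdout, and the initial handshake is left out.

```python
import io
import json


def read_message(stream):
    """Read one multilang message: JSON text followed by a line that is just 'end'.

    Hypothetical helper for illustration; not part of Storm's own Python module.
    """
    lines = []
    while True:
        line = stream.readline()
        if not line:
            raise EOFError("stream closed before the 'end' marker")
        if line.rstrip("\n") == "end":
            break
        lines.append(line)
    return json.loads("".join(lines))


def send_message(stream, message):
    """Write one multilang message: the JSON document, then 'end' on its own line."""
    stream.write(json.dumps(message))
    stream.write("\nend\n")
    stream.flush()


if __name__ == "__main__":
    # In-memory streams stand in for stdin/stdout so the sketch runs without Storm.
    incoming = io.StringIO(
        '{"id": "42", "comp": "1", "stream": "default", "task": 9, '
        '"tuple": ["hello world"]}\nend\n'
    )
    outgoing = io.StringIO()

    tup = read_message(incoming)
    send_message(outgoing, {"command": "ack", "id": tup["id"]})
    print(outgoing.getvalue())
```

In practice you would normally rely on the Python module bundled with Storm rather than hand-rolling the framing; the sketch only shows what travels over the pipe.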

The Python adapter supports emitting, anchoring, acking, and logging. Storm's "shell" command makes it easy to build the jar and upload it to Nimbus.
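As a rough illustration of those four operations, the sketch below builds the JSON commands a sentence-splitting bolt would send back to Storm for one input tuple. The command shapes ("emit" with "anchors", "ack", "log") follow the Multilang protocol specification as I understand it; the function name process and the word-splitting logic are assumptions made for the example, and the code only constructs the messages without talking to a running topology.

```python
def process(tup):
    """Return the multilang commands for one input tuple.

    Hypothetical bolt logic: split a sentence into words, emit each word
    anchored to the input tuple, then ack the input tuple.
    """
    tuple_id = tup["id"]
    sentence = tup["tuple"][0]
    words = sentence.split()

    commands = []
    # Logging: ask Storm to write a line to the worker log.
    commands.append({"command": "log", "msg": "splitting %d words" % len(words)})
    for word in words:
        # Emitting with anchoring: tie each output tuple to the input tuple
        # so a downstream failure can trigger a replay.
        commands.append({"command": "emit", "anchors": [tuple_id], "tuple": [word]})
    # Acking: tell Storm the input tuple was fully processed.
    commands.append({"command": "ack", "id": tuple_id})
    return commands


if __name__ == "__main__":
    for command in process({"id": "7", "tuple": ["storm with python"]}):
        print(command)
```

Each returned dictionary would be serialized as JSON and written to stdout, one message at a time.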

Here is a good reference:

https://docs.microsoft.com/en-us/azure/hdinsight/storm/apache-storm-develop-python-topology
