Can I write a Storm topology with Python?

Contributor

I know it sounds a bit crazy, but I am a data scientist 🙂 and my preferred language is still Python. I use it with Spark, but I would like to be able to implement some smart models within a Storm topology. We did not adopt Spark Streaming, and Storm still works best with Kafka for us. Any pointers on how to get started?

1 ACCEPTED SOLUTION

Contributor

@Boris Demerov

I guess we are both data scientists.

Storm ships with multilang adapter libraries for Python and Ruby.

The right place to start is src/storm.thrift. Because Storm topologies are just Thrift structures and Nimbus is a Thrift daemon, you can create and submit topologies in any language. Components written in a non-JVM language talk to Storm through the multilang protocol; see the Multilang protocol specification for details. The Thrift structure lets you define a multilang component explicitly as a program plus a script, e.g., python and the file implementing your bolt. Multilang exchanges JSON messages over stdin/stdout with the subprocess.
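To make the stdin/stdout part concrete, here is a minimal sketch of the message framing, assuming the usual multilang convention that each JSON message is followed by a line containing only `end`. The helper names `read_message` and `send_message` are mine, not Storm's, and in-memory streams stand in for the real stdin/stdout. In practice the Python adapter bundled with Storm does this for you.

```python
import io
import json


def read_message(stream):
    """Read one message: JSON text followed by a line containing only 'end'."""
    lines = []
    while True:
        line = stream.readline()
        if not line:
            # The parent process closed the pipe before finishing a message.
            raise EOFError("stream closed before 'end' marker")
        line = line.rstrip("\r\n")
        if line == "end":
            return json.loads("\n".join(lines))
        lines.append(line)


def send_message(stream, message):
    """Write one message as JSON, then the 'end' marker, then flush."""
    stream.write(json.dumps(message) + "\nend\n")
    stream.flush()


# Demo: in-memory streams stand in for sys.stdin / sys.stdout.
incoming = io.StringIO(
    '{"id": "42", "comp": 1, "stream": "default", "task": 9, '
    '"tuple": ["hello storm"]}\nend\n'
)
outgoing = io.StringIO()

tup = read_message(incoming)
send_message(outgoing, {"command": "ack", "id": tup["id"]})
```

In a real bolt you would pass `sys.stdin` and `sys.stdout` instead of the `StringIO` objects, and you would first handle the initial handshake message that Storm sends with the configuration and topology context.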

The Python adapter supports emitting, anchoring, acking, and logging. Storm's "shell" command makes it easy to build the jar and upload it to Nimbus.
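As a rough illustration of what emitting, anchoring, acking, and logging look like on the wire, the sketch below builds the JSON commands a word-splitting bolt might send back for one input tuple. The function name `split_sentence` and the sample tuple are hypothetical; the command shapes follow my reading of the multilang protocol, so check them against the specification. With the bundled `storm` Python module you would call its helper functions rather than build these dictionaries by hand.

```python
def split_sentence(tup):
    """Return the multilang commands for one input tuple (illustrative only)."""
    # Logging: ask the worker to write a line to its log.
    commands = [{"command": "log", "msg": "splitting tuple %s" % tup["id"]}]

    for word in tup["tuple"][0].split():
        # Emitting with anchoring: each word is anchored to the input tuple,
        # so a downstream failure can trigger a replay from the spout.
        commands.append(
            {"command": "emit", "anchors": [tup["id"]], "tuple": [word]}
        )

    # Acking: tell Storm the input tuple has been fully processed.
    commands.append({"command": "ack", "id": tup["id"]})
    return commands


# Example input in the shape Storm sends to a bolt.
example = {"id": "7", "comp": 1, "stream": "default", "task": 3,
           "tuple": ["storm speaks python"]}
commands = split_sentence(example)
```

Each dictionary in `commands` would then be serialized to JSON and written to stdout for the Storm worker to read.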

Here is a good reference:

https://docs.microsoft.com/en-us/azure/hdinsight/storm/apache-storm-develop-python-topology
