
Can I write a Storm topology with Python?


I know that it sounds a bit crazy, but I am a data scientist :) and my preferred language is still Python. I use it with Spark, but I would like to be able to implement some smart models within a Storm topology. We did not adopt Spark Streaming, and Storm still works best with Kafka for us. Any pointers on how to start?

1 ACCEPTED SOLUTION


Re: Can I write a Storm topology with Python?


@Boris Demerov

I guess we are both data scientists.

Storm ships with multilang adapters for Python and Ruby.

The right place to start is src/storm.thrift. Since Storm topologies are just Thrift structures, and Nimbus is a Thrift daemon, you can create and submit topologies in any language. Components written in other languages talk to Storm through the multilang protocol, which is specified in the Storm documentation under "Multilang protocol". The Thrift structure lets you define a multilang component explicitly as a program and a script, e.g., python and the file implementing your bolt. Storm launches that program as a subprocess and exchanges JSON messages with it over stdin/stdout.
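To make the stdin/stdout exchange concrete, here is a minimal sketch of the message framing as I understand it from the Multilang protocol page: each message is a JSON document followed by a line containing only "end". The helper names read_message and write_message are my own, not part of Storm, and the sample tuple fields are illustrative. It uses in-memory streams instead of real stdin/stdout so it can run without a cluster.

```python
import io
import json


def read_message(stream):
    """Read one framed message: JSON text terminated by a line 'end'."""
    lines = []
    for line in stream:
        line = line.rstrip("\n")
        if line == "end":
            break
        lines.append(line)
    else:
        # The loop finished without seeing the "end" terminator.
        raise EOFError("stream closed before the 'end' marker")
    return json.loads("\n".join(lines))


def write_message(msg, stream):
    """Write one framed message: JSON text, then a line 'end'."""
    stream.write(json.dumps(msg))
    stream.write("\nend\n")
    stream.flush()


# Simulated input from Storm (hypothetical tuple, for illustration only).
incoming = io.StringIO(
    '{"id": "t1", "comp": "1", "stream": "default", '
    '"task": 9, "tuple": ["hello world"]}\nend\n'
)
received = read_message(incoming)

# Simulated output back to Storm: acknowledge the tuple we just read.
outgoing = io.StringIO()
write_message({"command": "ack", "id": received["id"]}, outgoing)
print(outgoing.getvalue())
```

In a real component the streams would be sys.stdin and sys.stdout, and the first message from Storm would be the setup handshake rather than a tuple. The storm.py adapter that ships with Storm handles all of this for you, so treat the sketch as an explanation of the wire format rather than something to deploy.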

The Python adapter supports emitting, anchoring, acking, and logging. The "storm shell" command makes it easy to build the jar and upload it to Nimbus.
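As a rough illustration of those four operations, the sketch below builds the JSON commands a sentence-splitting bolt would send back to Storm for one input tuple. The command shapes ("emit" with "anchors", "ack", "log") follow my reading of the Multilang protocol page, so check them against your Storm version. The function split_sentence and the sample tuple are hypothetical, and nothing here talks to a real cluster.

```python
import json


def split_sentence(tup):
    """Return the multilang commands for one input tuple.

    Each word is emitted anchored to the input tuple, so a downstream
    failure causes the original tuple to be replayed. The input is
    acked only after all words have been emitted.
    """
    tuple_id = tup["id"]
    words = tup["tuple"][0].split()
    commands = [
        {"command": "emit", "anchors": [tuple_id], "tuple": [word]}
        for word in words
    ]
    commands.append(
        {"command": "log", "msg": "split %s into %d words" % (tuple_id, len(words))}
    )
    commands.append({"command": "ack", "id": tuple_id})
    return commands


# Hypothetical input tuple, shaped like the ones Storm sends to a bolt.
sample = {"id": "t1", "comp": "1", "stream": "default",
          "task": 9, "tuple": ["storm with python"]}
commands = split_sentence(sample)
for command in commands:
    print(json.dumps(command))
```

With the bundled storm.py adapter you would not build these dictionaries by hand: you subclass one of its bolt classes and call its emit, ack, and log helpers, which produce equivalent messages.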

Here is a good reference:

https://docs.microsoft.com/en-us/azure/hdinsight/storm/apache-storm-develop-python-topology
