Storm 1.0.1 - Python 2.7 execution latency

I have a Storm topology running in a distributed environment across 5 Linux nodes: one Nimbus and 4 supervisors.

A Kafka spout receives each message and forwards it to a ParseBolt, which parses the raw message and sends it to a prediction bolt, which in turn passes its output to an HBase bolt. The prediction bolt is written in Python 2.7.

My main problem is that the prediction bolt takes a long time to execute, and the latency sometimes reaches 30 minutes. My current settings are:

  • 24 workers
  • 96 topology.worker.shared.thread.pool.size
  • 6 supervisor ports
  • 40 parallelism_hint (for prediction bolt)
  • 40 num tasks (for prediction bolt)
  • nimbus.childopts -Xmx2048m
  • supervisor.childopts -Xmx512m
  • ui.childopts -Xmx4096m
  • drpc.childopts -Xmx4096m
  • worker.childopts -Xmx8196m
  • logviewer.childopts -Xmx128m
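
To put these numbers in perspective, here is a rough sketch of the per-node footprint they imply. It assumes that workers and prediction tasks are spread evenly across the 4 supervisors and that each Python bolt task runs as its own subprocess (as Storm's multilang ShellBolt normally does). The function and key names are only illustrative, not part of any Storm API.

```python
# Values copied from the settings listed above.
SUPERVISORS = 4
PORTS_PER_SUPERVISOR = 6
WORKERS = 24
WORKER_HEAP_MB = 8196        # from worker.childopts -Xmx8196m
PREDICTION_EXECUTORS = 40    # parallelism_hint for the prediction bolt
PREDICTION_TASKS = 40        # num tasks for the prediction bolt


def cluster_footprint():
    """Estimate the load per supervisor node, assuming an even spread."""
    total_slots = SUPERVISORS * PORTS_PER_SUPERVISOR
    workers_per_node = WORKERS / float(SUPERVISORS)
    return {
        "total_slots": total_slots,
        "workers_per_node": workers_per_node,
        # Maximum JVM heap the workers on one node may claim in total.
        "max_worker_heap_mb_per_node": workers_per_node * WORKER_HEAP_MB,
        # Prediction executors that land in each worker JVM on average.
        "prediction_executors_per_worker": PREDICTION_EXECUTORS / float(WORKERS),
        # Assumption: one Python subprocess per prediction task.
        "python_processes_per_node": PREDICTION_TASKS / float(SUPERVISORS),
    }


summary = cluster_footprint()
for name in sorted(summary):
    print("{0}: {1}".format(name, summary[name]))
```

Under these assumptions, all 24 slots are in use, each node may reserve up to about 48 GB of worker heap, and each node hosts roughly 10 Python processes in addition to the JVMs.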

How can I reduce the latency so that the prediction bolt runs in real time?

Any help would be appreciated.