05-13-2017
04:31 PM
Hi. I am trying to set up a performance baseline for our topology. I started with 1 worker node and 1 worker process. We have 1 Kafka spout, 1 processing bolt, and 1 HdfsBolt, each with parallelism set to 1. Below are the other Storm configs. We are using 1 D14_v2 in Azure for our worker/supervisor node, 2 D4_v2 for Nimbus, and 3 D12_v2 for the ZooKeepers.
worker.childopts: -Xmx4000m -Xms4000m (4 GB heap)
log level: WARN
supervisor.childopts: -Xmx1024m _JAAS_PLACEHOLDER
-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.ssl=false
-Dcom.sun.management.jmxremote.authenticate=false
-Dcom.sun.management.jmxremote.port={{jmxremote_port}}
-javaagent:/usr/hdp/current/storm-supervisor/contrib/storm-jmxetric/lib/jmxetric-1.0.4.jar=host=localhost,port=8650,wireformat31x=true,mode=multicast,config=/usr/hdp/current/storm-supervisor/contrib/storm-jmxetric/conf/jmxetric-conf.xml,process=Supervisor_JVM
-Detwlogger.component=supervisor
topology.max.spout.pending: 250
topology.worker.max.heap.size.mb: 4000
topology.acker.executors: 6
topology.workers: 1

I deployed the topology to the Storm cluster and pumped messages into the Kafka topic, but the KafkaSpout doesn't consume any of them. I did check to make sure the topic has messages. In our KafkaSpout we give a static list of broker nodes. I tried telnet from the worker node to both the Kafka brokers and the ZooKeepers, and both connections succeeded. Does anyone have insight into what could be wrong with the KafkaSpout that's preventing it from consuming messages from Kafka?
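For reference, our spout wiring looks roughly like the sketch below (storm-kafka API; the topic name, broker hosts, ZK path, and client id are illustrative placeholders, not our real values). Two settings I am double-checking, since either can make a spout sit idle: with StaticHosts every partition of the topic must be mapped to its current leader broker, and a previously committed ZooKeeper offset is honored by default, so messages produced before deployment may never be read unless ignoreZkOffsets/startOffsetTime say otherwise.

import kafka.api.OffsetRequest;
import org.apache.storm.kafka.Broker;
import org.apache.storm.kafka.KafkaSpout;
import org.apache.storm.kafka.SpoutConfig;
import org.apache.storm.kafka.StaticHosts;
import org.apache.storm.kafka.StringScheme;
import org.apache.storm.kafka.trident.GlobalPartitionInformation;
import org.apache.storm.spout.SchemeAsMultiScheme;

public class BaselineKafkaSpoutSketch {
    public static KafkaSpout build() {
        // Static broker list: every partition of the topic must be mapped
        // to its current leader, or those partitions are silently skipped.
        GlobalPartitionInformation partitions = new GlobalPartitionInformation("my-topic");
        partitions.addPartition(0, new Broker("broker1.example.com", 9092));
        partitions.addPartition(1, new Broker("broker2.example.com", 9092));
        StaticHosts hosts = new StaticHosts(partitions);

        SpoutConfig config = new SpoutConfig(hosts, "my-topic", "/kafka-offsets", "baseline-spout-id");
        config.scheme = new SchemeAsMultiScheme(new StringScheme());
        // A committed offset in ZooKeeper is honored by default; force a
        // read from the beginning of the topic for the baseline test.
        config.ignoreZkOffsets = true;
        config.startOffsetTime = OffsetRequest.EarliestTime();
        return new KafkaSpout(config);
    }
}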
04-05-2017
08:10 PM
Hi,
We are using Apache Flux for our Storm topology configuration. I am looking into how to reproduce the code below in Flux, specifically how to set memoryLoad and cpuLoad on spouts and bolts via the Flux config.yaml. If anyone has done this, please let me know. We are using version 1.0.1.
topologyBuilder.setSpout("xxx", xx, 10).setMemoryLoad(350000.0, 2048.0).setCPULoad(95);
topologyBuilder.setBolt("xx", xx, 140).localOrShuffleGrouping("xx").setMemoryLoad(350000.0, 2048.0).setCPULoad(95);
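I could not find a per-component equivalent of setMemoryLoad/setCPULoad documented for Flux 1.0.1. One workaround sketch that uses only stock Storm config keys: Flux's top-level config map becomes the topology config, and the resource-aware-scheduler keys there set defaults that apply to every component instance (values below are illustrative, not ours):

# Sketch of a Flux config.yaml fragment (values illustrative).
name: "example-topology"
config:
  topology.workers: 1
  # Stock Storm RAS keys; these set per-component-instance DEFAULTS,
  # not individual per-spout/bolt overrides:
  topology.component.resources.onheap.memory.mb: 2048.0
  topology.component.resources.offheap.memory.mb: 0.0
  topology.component.cpu.pcore.percent: 95.0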
03-29-2017
03:07 AM
@Ambud Sharma We are testing this change and will accept once we are done. I am still not 100% convinced that this solves the problem, since the Storm documentation says BasicBolt does the acking and anchoring: http://storm.apache.org/releases/1.0.1/Guaranteeing-message-processing.html Search for BasicBolt in that link and you will find "Storm has an interface called BasicBolt that encapsulates this pattern for you."
03-22-2017
07:00 PM
This is what is mentioned in the book Storm Applied: "The beauty of using BaseBasicBolt as our base class is that it automatically provides anchoring and acking for us." We are using BaseBasicBolt. Are you saying that this is incorrect?
03-22-2017
03:40 PM
@Ambud Sharma Wondering if you have more insight. Let me know if you need more details. TIA.
03-21-2017
03:21 PM
@Ambud Sharma Yes. There is a case where the message from Bolt2 doesn't get written but the one from Bolt3 should still get written. But if Bolt2's output is written, Bolt3's output should always be there; the reverse is not true. Is that a problem? We are not anchoring tuples. We are extending BaseBasicBolt, and from what I understand we need to anchor tuples only if we extend BaseRichBolt. Is that incorrect? No, we are not doing any micro-batching.
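To make sure we are talking about the same thing, this is my understanding of the two base classes (a minimal sketch against the Storm 1.0.1 API; bolt names and fields are made up):

import java.util.Map;
import org.apache.storm.task.OutputCollector;
import org.apache.storm.task.TopologyContext;
import org.apache.storm.topology.BasicOutputCollector;
import org.apache.storm.topology.OutputFieldsDeclarer;
import org.apache.storm.topology.base.BaseBasicBolt;
import org.apache.storm.topology.base.BaseRichBolt;
import org.apache.storm.tuple.Fields;
import org.apache.storm.tuple.Tuple;
import org.apache.storm.tuple.Values;

// BaseBasicBolt: emits are anchored to the input tuple, and the input is
// acked automatically after execute() returns (or failed on FailedException).
public class UpperBasicBolt extends BaseBasicBolt {
    @Override
    public void execute(Tuple input, BasicOutputCollector collector) {
        collector.emit(new Values(input.getString(0).toUpperCase()));
        // no explicit anchoring or acking needed
    }
    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("upper"));
    }
}

// BaseRichBolt: anchoring and acking are manual; forgetting either one
// breaks the tuple tree, and the spout never sees the ack.
public class UpperRichBolt extends BaseRichBolt {
    private OutputCollector collector;
    @Override
    public void prepare(Map stormConf, TopologyContext context, OutputCollector collector) {
        this.collector = collector;
    }
    @Override
    public void execute(Tuple input) {
        collector.emit(input, new Values(input.getString(0).toUpperCase())); // anchored emit
        collector.ack(input); // explicit ack
    }
    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        declarer.declare(new Fields("upper"));
    }
}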
03-17-2017
11:43 AM
Hi all. We are noticing that some messages get lost during Storm processing. Below is a brief outline of our pipeline. Messages come into Kafka and are consumed by 2 different Kafka spouts in Storm. One spout writes the message to a raw stream, and the other spout starts processing the message. We need to store the output of Bolt2 to HDFS and also send it downstream for further processing, which eventually ends up in ADLS as well. All 3 HDFS bolts are configured to write to different folder structures in ADLS.

In an ideal scenario I should see all 3 messages in ADLS (raw, output of Bolt2, and output of Bolt3). But we are noticing that raw always gets written, while sometimes only one of the outputs (Bolt2 or Bolt3) gets written to ADLS. It's inconsistent which one is missing; sometimes both get written. There aren't any errors/exceptions in the log messages. Did anyone run into such issues? Any insight will be appreciated. Are there any good monitoring tools other than Storm UI that give insight into what is going on? We are using HDInsight, hosted on Azure, with Storm 1.0.1. Thanks.
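For context, each of the three HDFS bolts is wired roughly like the sketch below (standard storm-hdfs builder API; the ADLS URL, path, and policy values are illustrative, not our exact settings). I mention it because the sync and rotation policies control when written tuples actually become visible in the store:

import org.apache.storm.hdfs.bolt.HdfsBolt;
import org.apache.storm.hdfs.bolt.format.DefaultFileNameFormat;
import org.apache.storm.hdfs.bolt.format.DelimitedRecordFormat;
import org.apache.storm.hdfs.bolt.format.FileNameFormat;
import org.apache.storm.hdfs.bolt.format.RecordFormat;
import org.apache.storm.hdfs.bolt.rotation.FileRotationPolicy;
import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy;
import org.apache.storm.hdfs.bolt.rotation.FileSizeRotationPolicy.Units;
import org.apache.storm.hdfs.bolt.sync.CountSyncPolicy;
import org.apache.storm.hdfs.bolt.sync.SyncPolicy;

public class HdfsBoltSketch {
    public static HdfsBolt buildRawBolt() {
        RecordFormat format = new DelimitedRecordFormat().withFieldDelimiter("|");
        SyncPolicy syncPolicy = new CountSyncPolicy(100);        // hsync every 100 tuples
        FileRotationPolicy rotationPolicy =
                new FileSizeRotationPolicy(64.0f, Units.MB);     // rotate files at 64 MB
        FileNameFormat fileNameFormat =
                new DefaultFileNameFormat().withPath("/raw/");   // per-bolt folder in the store
        return new HdfsBolt()
                .withFsUrl("adl://example.azuredatalakestore.net") // illustrative ADLS account
                .withFileNameFormat(fileNameFormat)
                .withRecordFormat(format)
                .withRotationPolicy(rotationPolicy)
                .withSyncPolicy(syncPolicy);
    }
}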