Member since: 09-23-2015
Posts: 81
Kudos Received: 108
Solutions: 41
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5702 | 08-17-2017 05:00 PM
 | 2263 | 06-03-2017 09:07 PM
 | 2783 | 03-29-2017 06:02 PM
 | 5278 | 03-07-2017 06:16 PM
 | 1959 | 02-26-2017 06:30 PM
12-16-2015
03:51 PM
3 Kudos
We are releasing Apache Kafka 0.9 in HDP 2.3.4. We are very close to the release.
12-15-2015
07:13 PM
4 Kudos
Spark Streaming hasn't yet enabled security for its Kafka connector.
12-14-2015
05:52 PM
1 Kudo
By default, Ambari-installed Kafka binds to the host.name of that machine. So you can access the Kafka broker by hostname from any other machine, as long as the hostname is DNS-resolvable or added to your /etc/hosts.
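For example, if DNS can't resolve the broker's hostname, a client machine's /etc/hosts can map it manually (the IP and hostname below are made-up illustrations, not values from this cluster):

```
# /etc/hosts on the client machine (example values)
10.0.0.11   kafka-broker-1.example.com   kafka-broker-1
```

After this, clients can reach the broker at kafka-broker-1.example.com:6667 even without DNS.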
12-07-2015
06:07 PM
1 Kudo
@jniemiec, which version of Storm are you using? From 0.10 onwards we relocated (shaded) the dependencies in Storm so that user topologies won't run into conflicts like the one above.
12-02-2015
08:30 PM
1 Kudo
For JMX params on the Kafka broker side, you can use the kafka-env section in Ambari and restart the brokers. For the kafka-producer-perf-test.sh script, you can pass them via the shell environment.
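A sketch of what that kafka-env addition might look like (the port and flags are illustrative assumptions; Kafka's run scripts pick up JMX_PORT and KAFKA_JMX_OPTS from the environment):

```
# Appended to the kafka-env template in Ambari (Kafka > Configs), then restart brokers
export JMX_PORT=9999
export KAFKA_JMX_OPTS="-Dcom.sun.management.jmxremote -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false"
```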
11-06-2015
12:32 AM
No. The Kafka spout in HDP 2.2's Storm doesn't support connecting to a Kerberized Kafka.
10-26-2015
03:59 PM
Storm multi-tenancy can be enabled only when security is in use. In non-secure mode, all worker JVMs run as the single user "storm". In secure mode (using Kerberos), you can set supervisor.run.worker.as.user to true. This makes each worker JVM run as the user who submitted the topology, which gives worker isolation between topologies. More details: https://github.com/apache/storm/blob/master/SECURITY.md#run-worker-processes-as-user-who-submitted-the-topology
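The setting above can be sketched as a storm.yaml fragment (a Kerberos-secured cluster is assumed as a prerequisite, per the SECURITY.md link):

```
# storm.yaml on each supervisor node (secure/Kerberos mode assumed)
supervisor.run.worker.as.user: true
```

With this set, the supervisor launches each worker JVM under the submitting user's OS account instead of the "storm" user.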
10-17-2015
05:12 PM
2 Kudos
The Kafka producer doesn't need to know about the ZooKeeper cluster. It takes a broker list as a config option, which is then used to send topic-metadata requests to determine who is the leader of each topic partition.
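As an illustration, a minimal producer config only lists brokers and never a ZooKeeper quorum (the host names below are hypothetical; the property name is from the 0.8-era producer API):

```
# producer.properties (0.8-era producer); note there is no zookeeper.connect here
metadata.broker.list=broker1.example.com:6667,broker2.example.com:6667
```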
10-17-2015
05:10 PM
That is the old config page (http://kafka.apache.org/07/configuration.html, for version 0.7); for the newer configuration you should look at https://kafka.apache.org/08/configuration.html. With kafka-topics.sh you choose how many partitions you would like a topic to have. Usually this should be determined by how much parallelism you want on the consumer side when reading from the topic. Kafka doesn't determine how the data is distributed into the topic partitions; that depends on the producer. As you said, if the key is null it does round-robin, not random. If a key is provided, it does hash-based distribution. All of this happens on the producer side, not in the Kafka broker.
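The producer-side partition selection described above can be sketched in Python. This is a simplified stand-in, not Kafka's actual default partitioner (which hashes keys with murmur2); the class name and the byte-sum "hash" are illustrative assumptions, chosen only to show the null-key round-robin versus keyed-hash control flow:

```python
# Illustrative sketch of producer-side partition selection.
# NOT Kafka's real partitioner: real Kafka uses murmur2 on the key bytes.
from itertools import count

class SketchPartitioner:
    def __init__(self, num_partitions):
        self.num_partitions = num_partitions
        self._rr = count()  # round-robin counter used for null keys

    def partition(self, key):
        if key is None:
            # Null key: cycle through partitions round-robin.
            return next(self._rr) % self.num_partitions
        # Keyed message: deterministic hash-based placement, so the
        # same key always lands on the same partition.
        return sum(key.encode("utf-8")) % self.num_partitions

p = SketchPartitioner(3)
print([p.partition(None) for _ in range(4)])              # round-robin: [0, 1, 2, 0]
print(p.partition("user-42") == p.partition("user-42"))   # same key, same partition: True
```

The broker never runs this logic; it just appends whatever the producer sends to the chosen partition.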
10-16-2015
11:27 PM
1 Kudo
Make sure you set the following config in the KafkaSpout's SpoutConfig: spoutConfig.startOffsetTime = kafka.api.OffsetRequest.EarliestTime(); (see https://github.com/apache/storm/tree/master/external/storm-kafka). Apart from that:
1. Make sure log.retention.hours is long enough to retain the topic data.
2. Check the Kafka topic offsets: bin/kafka-run-class.sh kafka.tools.GetOffsetShell --broker-list hostname:6667 --topic topic_name --time -1. This command gives you the latest offset in the Kafka topic; now you need to check whether the Storm KafkaSpout is catching up.
2.1 Log into the zookeeper shell.
2.2 ls /zkroot/id (zkroot is the one configured in SpoutConfig, and id is from SpoutConfig as well).
2.3 get /zkroot/id/topic_name/part_0 gives you a JSON structure with the key "offset". This tells you how far you have read into the topic, and also how far behind the latest data you are. If the two are too far apart and log.retention.hours has kicked in, the KafkaSpout might be requesting an older offset that has already been deleted.
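The lag check in the steps above can be sketched in Python. The offset values and the znode JSON below are made up for illustration; in practice the latest offset comes from GetOffsetShell and the committed offset from the spout's znode:

```python
# Sketch: compute KafkaSpout lag from the two numbers gathered above.
# Both inputs here are hypothetical example values.
import json

latest_offset = 125000          # from kafka.tools.GetOffsetShell --time -1
znode_json = '{"offset": 98000, "topic": "topic_name", "partition": 0}'  # from `get /zkroot/id/...`

committed = json.loads(znode_json)["offset"]
lag = latest_offset - committed
print(lag)  # 27000 messages behind
```

If this lag keeps growing, or exceeds what log.retention.hours retains, the spout will start requesting deleted offsets.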