Created 12-01-2017 09:29 PM
Hi,
I have a Metron cluster running with data feeds coming in. Is there a way to verify that x number of documents come through Kafka for a topic, then through parsing, enrichment, and finally indexing into ES? The reason I'm asking is that I don't see as many documents indexed yesterday and today (for instance) as I usually do every day.
Thanks in advance for feedback.
Created 12-04-2017 11:37 AM
Hi Arian,
If you want to get serious about the volumes of processed feeds, it is time to start tracking the Kafka offset growth:
/usr/hdp/<VERSION>/kafka/bin/kafka-consumer-offset-checker.sh --zookeeper zk_host:2181 --security-protocol SASL_PLAINTEXT --topic parsing --group parsing
/usr/hdp/<VERSION>/kafka/bin/kafka-consumer-offset-checker.sh --zookeeper zk_host:2181 --security-protocol SASL_PLAINTEXT --topic enrichments --group enrichments
/usr/hdp/<VERSION>/kafka/bin/kafka-consumer-offset-checker.sh --zookeeper zk_host:2181 --security-protocol SASL_PLAINTEXT --topic indexing --group indexing
Just check the growth of the topics between some test runs (if all topologies run continuously it will be tricky to squeeze out the exact numbers).
In Elastic, make sure you set your queries right (check which date_stamp is used for index time) to make a fair comparison. Errors during parsing, enrichment or indexing can also account for some gaps, depending on where you direct those in your Metron config.
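If you want to turn those counters into a rough per-minute rate, something like the sketch below could work. The awk column positions (Offset in column 4, logSize in column 5) are an assumption based on the typical offset-checker output, so verify them against what your version actually prints, and drop the --security-protocol part if the cluster is not kerberized.
# Sketch: rough throughput estimate for the "indexing" topic/group over a 60 second window.
# Assumes one row per partition with Offset in column 4 and logSize in column 5.
BIN=/usr/hdp/<VERSION>/kafka/bin/kafka-consumer-offset-checker.sh
sample() {
  $BIN --zookeeper zk_host:2181 --security-protocol SASL_PLAINTEXT --topic indexing --group indexing \
    | awk '$3 ~ /^[0-9]+$/ { produced += $5; consumed += $4 } END { print produced+0, consumed+0 }'
}
read p1 c1 <<< "$(sample)"
sleep 60
read p2 c2 <<< "$(sample)"
echo "produced per minute: $((p2 - p1))"   # growth of logSize = new messages written to the topic
echo "consumed per minute: $((c2 - c1))"   # growth of Offset  = messages the topology got through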
The Storm UI's numbers are not that easy to get right so don't waste too much time on those 🙂
Created 12-04-2017 02:40 PM
Thank you very much for your response @Jasper
When I tried to run kafka-consumer-offset-checker, it said that it's deprecated and that I don't have a JAAS configuration in place. A quick lookup suggests it's something about a Kerberos single sign-on account? I don't know if I need that, but I'll look into it and find out. I'll be back with more updates, thank you!
Created 12-04-2017 02:47 PM
@Arian Trayen just ignore the deprecation message. The Kafka project wants to deprecate it, but the replacement is not complete yet, so you can safely ignore that.
If you don't have Kerberos, leave out the "--security-protocol SASL_PLAINTEXT" part.
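For example, on a non-kerberized cluster the indexing check is simply:
/usr/hdp/<VERSION>/kafka/bin/kafka-consumer-offset-checker.sh --zookeeper zk_host:2181 --topic indexing --group indexing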
Created 12-04-2017 03:59 PM
Thank you so much @Jasper
This is pretty neat! It provides Offset, logSize and Lag information. I'm fairly new to Kafka as well, and it looks like I'm very far behind on indexing. However, from the Storm UI it didn't look like I had that many incoming messages on the indexing topic. From what I've read, Lag should be close to 0, which would indicate that the system is caught up. How do I get the Lag down to 0? Do I need more indexing Storm workers?
Offset: 82387393
logSize: 326704262
Lag: 244316869
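For reference, the Lag is simply logSize minus Offset, i.e. the number of messages written to the topic that the consumer group has not read yet:
echo $(( 326704262 - 82387393 ))   # 244316869, which matches the reported Lag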
As always, thank you for your time and response.
Created 12-04-2017 04:31 PM
Look out for any errors in the indexing topology worker log at
/var/log/storm/worker-artifacts/<indexing-####-1234566>/6701/worker.log
(replace "indexing-####-1234566" with the real id of your current indexing topology by consulting the Storm UI.)
This will probably reveal the problem. If not, you could also run the indexing topology in DEBUG mode for a while (also via the Storm UI).
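For example, a quick scan for errors could look like this (the worker id and port are the placeholders from the path above; substitute the real ones from the Storm UI):
grep -iE 'error|exception' /var/log/storm/worker-artifacts/<indexing-####-1234566>/6701/worker.log | tail -n 50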
Created 12-04-2017 08:05 PM
Thank you @Jasper
I noticed I kept getting an error about fetching an offset out of range. I had changed the Kafka log retention rule to be shorter because I kept running out of space (pcap ingestion meant the Kafka log for pcap took all my disk). Since I've stopped ingesting pcap, I reverted the Kafka retention rule, and hopefully it won't complain anymore about trying to read an offset that has already been wiped out. If this doesn't work, I'll try the DEBUG mode that you suggested. Thank you again for your help!
Fetch offset 82387394 is out of range for partition indexing-0, resetting offset
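In case it helps someone else: I believe any per-topic retention override takes precedence over the broker-level retention settings, so it's worth double-checking what is set on the topic itself, e.g. with:
/usr/hdp/<VERSION>/kafka/bin/kafka-configs.sh --zookeeper zk_host:2181 --entity-type topics --entity-name indexing --describe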
Created 12-04-2017 08:17 PM
Normally you can get past this by stopping the topology, changing the first poll strategy for that topology in Ambari to "EARLIEST" (or "LATEST" if you want) for one restart only, and then restarting the topology.
(Don't forget to put it back to UNCOMMITTED_EARLIEST once the "out of range" errors have gone.)
Created 12-05-2017 06:44 PM
Thank you very much @Jasper
I was able to do that for the indexing topology, but how do you set that for the parsing and enrichment topologies?
I still see a large number of failed tuples under the indexing topology but nothing obvious in the logs. Occasionally I see the Kafka coordinator being marked dead and then rediscovered, and I don't know how to fix that, but it goes away after a while.
Created 12-04-2017 04:29 PM
FYI... I didn't know the names of some of my Kafka consumer groups, and I figured out a way to list them all and describe each of them. Hopefully it will help someone like me.
/usr/hdp/<VERSION>/kafka/bin/kafka-consumer-groups.sh --list --zookeeper <zk_host:port> | while read group; do
  echo "$group"
  /usr/hdp/<VERSION>/kafka/bin/kafka-consumer-groups.sh --zookeeper <zk_host:port> --describe --group "$group"
done