
Re: Metron - Error enriching squid data in Storm

@Aaron Harris

First, check HBase in Ambari to make sure it is green. The threat intelligence enrichments use HBase.

Another thing to check is the squid log that is sent to Kafka. One thing I found with squid is that if you aren't constantly sending HTTP requests through it, the logs roll over and the latest log has no messages. In a production system where squid is routing user HTTP requests, the log won't be empty. I think you may be running into this problem.

Check the messages going to the squid topic. It looks like they might be missing some information, such as the source and destination IPs. An easy way to fix this is to make the squid requests again so that the most recent log is populated.

The squid messages should look something like this:

[vagrant@node1 ~]$ /usr/hdp/2.4.2.0-258/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic squid --from-beginning

{metadata.broker.list=node1:6667, request.timeout.ms=30000, client.id=console-consumer-31722, security.protocol=PLAINTEXT}

1476285641.838 1439 127.0.0.1 TCP_MISS/200 457194 GET http://www.aliexpress.com/af/shoes.html? - DIRECT/104.81.164.40 text/html

1476285642.545 704 127.0.0.1 TCP_MISS/200 40385 GET http://www.help.1and1.co.uk/domains-c40986/transfer-domains-c79878 - DIRECT/212.227.34.3 text/html

1476285644.617 2068 127.0.0.1 TCP_MISS/200 177264 GET http://www.pravda.ru/science/ - DIRECT/185.103.135.90 text/html
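If it helps, here is a rough Python sketch for spotting lines that lack a source or destination IP before they reach Kafka. It assumes the log uses the same layout as the sample lines above and that squid writes `-` in place of a missing address; the `missing_ips` helper and the regex are my own illustration, not part of Metron or squid.

```python
import re

# Assumed field layout, taken from the sample lines above:
# timestamp elapsed client action/code bytes method url ident hierarchy/peer content-type
SQUID_LINE = re.compile(
    r"^(?P<timestamp>\d+\.\d+)\s+(?P<elapsed>\d+)\s+(?P<ip_src_addr>\S+)\s+"
    r"(?P<action>\w+)/(?P<code>\d+)\s+(?P<bytes>\d+)\s+(?P<method>\S+)\s+"
    r"(?P<url>\S+)\s+(?P<ident>\S+)\s+(?P<hierarchy>\w+)/(?P<ip_dst_addr>\S+)\s+"
    r"(?P<content_type>\S+)\s*$"
)


def missing_ips(line):
    """Return the names of IP fields that are absent ('-') in one log line.

    A line that does not match the assumed layout is reported as 'unparseable'.
    """
    match = SQUID_LINE.match(line.strip())
    if match is None:
        return ["unparseable"]
    return [
        field
        for field in ("ip_src_addr", "ip_dst_addr")
        if match.group(field) == "-"
    ]


if __name__ == "__main__":
    good = ("1476285641.838 1439 127.0.0.1 TCP_MISS/200 457194 GET "
            "http://www.aliexpress.com/af/shoes.html? - DIRECT/104.81.164.40 text/html")
    # Made-up line with no destination address, for illustration only.
    bad = ("1476285641.838 0 127.0.0.1 TCP_HIT/200 457194 GET "
           "http://www.aliexpress.com/af/shoes.html? - NONE/- text/html")
    print(missing_ips(good))
    print(missing_ips(bad))
```

You could run each line of the squid access log through `missing_ips` and regenerate the requests if anything comes back non-empty.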

Then check the squid messages going to the enrichments topic. They should look something like this:

[vagrant@node1 ~]$ /usr/hdp/2.4.2.0-258/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic enrichments --from-beginning | grep squid

{"full_hostname":"www.aliexpress.com","code":200,"method":"GET","url":"http:\/\/www.aliexpress.com\/af\/shoes.html?","source.type":"squid","elapsed":1439,"ip_dst_addr":"104.81.164.40","original_string":"1476285641.838 1439 127.0.0.1 TCP_MISS\/200 457194 GET http:\/\/www.aliexpress.com\/af\/shoes.html? - DIRECT\/104.81.164.40 text\/html","bytes":457194,"domain_without_subdomains":"aliexpress.com","action":"TCP_MISS","ip_src_addr":"127.0.0.1","timestamp":1476285641838}

{"full_hostname":"www.help.1and1.co.uk","code":200,"method":"GET","url":"http:\/\/www.help.1and1.co.uk\/domains-c40986\/transfer-domains-c79878","source.type":"squid","elapsed":704,"ip_dst_addr":"212.227.34.3","original_string":"1476285642.545 704 127.0.0.1 TCP_MISS\/200 40385 GET http:\/\/www.help.1and1.co.uk\/domains-c40986\/transfer-domains-c79878 - DIRECT\/212.227.34.3 text\/html","bytes":40385,"domain_without_subdomains":"1and1.co.uk","action":"TCP_MISS","ip_src_addr":"127.0.0.1","timestamp":1476285642545}
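The same kind of check can be done on the parsed JSON from the enrichments topic. This is only a sketch: the `missing_fields` helper and the choice of required fields are my assumptions, based on the fields shown in the messages above.

```python
import json

# Fields this sketch treats as required; chosen from the sample messages above.
REQUIRED_FIELDS = ("ip_src_addr", "ip_dst_addr", "timestamp")


def missing_fields(message):
    """Return the required fields that are absent or empty in one JSON message."""
    record = json.loads(message)
    return [
        field
        for field in REQUIRED_FIELDS
        if record.get(field) in (None, "")
    ]


if __name__ == "__main__":
    # Shortened copy of the first sample message above.
    sample = ('{"full_hostname":"www.aliexpress.com","code":200,"method":"GET",'
              '"source.type":"squid","ip_dst_addr":"104.81.164.40",'
              '"ip_src_addr":"127.0.0.1","timestamp":1476285641838}')
    print(missing_fields(sample))
```

Piping the console consumer output through something like this makes it easier to see which messages would fail enrichment.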

Re: Metron - Error enriching squid data in Storm


@cduby

Thanks for all your help along the way. I think I am finally up and running now.

I found the issue with the enrichments: the squid logs I had generated were missing the destination IP address. Once I regenerated them, cleared the Kafka queues, and restarted the topologies, the data started flowing through into the Elasticsearch index.

Then, to get around the timestamp issue, I had to curl an index template into Elasticsearch for the squid data, with the timestamp field specified as a date, as below:

curl -XPUT http://node1:9200/_template/squid -d '{"template":"squid*","mappings": {"squid*": {"properties": {"timestamp": { "type": "date" }}}}}'
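For anyone who would rather script this, here is a rough Python equivalent of the curl command, using only the standard library. The `build_template_request` helper is just my own illustration; the URL and template body are the ones from the command above, and I have not tested it against other Elasticsearch versions.

```python
import json
import urllib.request


def build_template_request(base_url):
    """Build (but do not send) the PUT request for the squid index template."""
    body = {
        "template": "squid*",
        "mappings": {
            "squid*": {
                "properties": {
                    # Map the timestamp field as a date, as in the curl command.
                    "timestamp": {"type": "date"}
                }
            }
        },
    }
    return urllib.request.Request(
        base_url + "/_template/squid",
        data=json.dumps(body).encode("utf-8"),
        # Set explicitly so urllib does not fall back to a form-encoded type.
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    request = build_template_request("http://node1:9200")
    # Uncomment to actually send it to the cluster:
    # with urllib.request.urlopen(request) as response:
    #     print(response.read().decode("utf-8"))
    print(request.get_method(), request.full_url)
```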

Re: Metron - Error enriching squid data in Storm

@Aaron Harris Glad you are up and running!