Created 10-11-2016 03:00 PM
Am running through the tutorial to add a new telemetry source into Metron and have encountered a problem with the enrichmentJoinBolt in Storm, it is failing to process any of the messages that the Squid topology has process with the below error;
2016-10-11 14:32:09 o.a.m.e.b.EnrichmentSplitterBolt [ERROR] Unable to retrieve a sensor enrichment config of squid 2016-10-11 14:32:09 o.a.m.e.b.EnrichmentJoinBolt [ERROR] Unable to retrieve a sensor enrichment config of squid 2016-10-11 14:32:09 o.a.m.e.b.JoinBolt [ERROR] [Metron] Unable to join messages: {"code":0,"method":"GET","enrichmentsplitterbolt.splitter.end.ts":"1476196329341","enrichmentsplitterbolt.splitter.begin.ts":"1476196329341","url":"https:\/\/tfl.gov.uk\/plan-a-journey\/","source.type":"squid","elapsed":31271,"ip_dst_addr":null,"original_string":"1476113538.772 31271 127.0.0.1 TCP_MISS\/000 0 GET https:\/\/tfl.gov.uk\/plan-a-journey\/ - DIRECT\/tfl.gov.uk -","bytes":0,"action":"TCP_MISS","ip_src_addr":"127.0.0.1","timestamp":1476113538772} java.lang.NullPointerException: null at org.apache.metron.enrichment.bolt.EnrichmentJoinBolt.joinMessages(EnrichmentJoinBolt.java:76) ~[stormjar.jar:na] at org.apache.metron.enrichment.bolt.EnrichmentJoinBolt.joinMessages(EnrichmentJoinBolt.java:33) ~[stormjar.jar:na] at org.apache.metron.enrichment.bolt.JoinBolt.execute(JoinBolt.java:111) ~[stormjar.jar:na] at backtype.storm.daemon.executor$fn__7014$tuple_action_fn__7016.invoke(executor.clj:670) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.daemon.executor$mk_task_receiver$fn__6937.invoke(executor.clj:426) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.disruptor$clojure_handler$reify__6513.onEvent(disruptor.clj:58) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.daemon.executor$fn__7014$fn__7027$fn__7078.invoke(executor.clj:808) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.util$async_loop$fn__545.invoke(util.clj:475) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:na] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
I am using the full-dev environment with Metron 0.2.0BETA and the guide, https://cwiki.apache.org/confluence/display/METRON/2016/04/25/Metron+Tutorial+-+Fundamentals+Part+1%...
I can see data in the kibana dashboard from Bro and Yaf, which both also have indexes created in elastic, however there is no index for the squid data.
I tried killing the Storm topologies and re-running ./run_enrichment_role.sh then after this restarting the squid parser topology.
Any help would be greatly appreciated.
Created 10-11-2016 04:43 PM
Check to be sure the enrichment config and parser configs for squid are installed using the zk_load_configs.sh with the -m DUMP method:
For example on quick dev run this command. The parser enrichment configs are in bold:
[vagrant@node1 ~]$ /usr/metron/0.2.0BETA/bin/zk_load_configs.sh -i /usr/metron/0.2.0BETA/config/zookeeper/ -m DUMP -z localhost:2181 | grep -i squid | grep Config
log4j:WARN No appenders could be found for logger (org.apache.curator.framework.imps.CuratorFrameworkImpl).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
PARSER Config: squid
ENRICHMENT Config: squid
If not, check the zookeeper config directory:
[vagrant@node1 ~]$ ls /usr/metron/0.2.0BETA/config/zookeeper/enrichments/
bro.json snort.json squid.json websphere.json yaf.json
Then update zookeeper:
/usr/metron/0.2.0BETA/bin/zk_load_configs.sh -i /usr/metron/0.2.0BETA/config/zookeeper/ -m PUSH -z localhost:2181
Then you will probably need to restart the enrichment topology. From Ambari, go to the storm UI, click into the enrichment topology and then the Kill button. If you are using quick dev, monit should automatically restart.
Created 10-11-2016 04:43 PM
Check to be sure the enrichment config and parser configs for squid are installed using the zk_load_configs.sh with the -m DUMP method:
For example on quick dev run this command. The parser enrichment configs are in bold:
[vagrant@node1 ~]$ /usr/metron/0.2.0BETA/bin/zk_load_configs.sh -i /usr/metron/0.2.0BETA/config/zookeeper/ -m DUMP -z localhost:2181 | grep -i squid | grep Config
log4j:WARN No appenders could be found for logger (org.apache.curator.framework.imps.CuratorFrameworkImpl).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
PARSER Config: squid
ENRICHMENT Config: squid
If not, check the zookeeper config directory:
[vagrant@node1 ~]$ ls /usr/metron/0.2.0BETA/config/zookeeper/enrichments/
bro.json snort.json squid.json websphere.json yaf.json
Then update zookeeper:
/usr/metron/0.2.0BETA/bin/zk_load_configs.sh -i /usr/metron/0.2.0BETA/config/zookeeper/ -m PUSH -z localhost:2181
Then you will probably need to restart the enrichment topology. From Ambari, go to the storm UI, click into the enrichment topology and then the Kill button. If you are using quick dev, monit should automatically restart.
Created 10-11-2016 08:26 PM
Thanks the missing enrichment config was definitely the issue, I copied the yaf.json and amended the index JSON field to squid instead of yaf and then uploaded as per your instructions. Below is the squid.json file I created.
{ "index":"squid", "batchSize": 5, "enrichment" : { "fieldMap": { "geo": ["ip_dst_addr", "ip_src_addr"] } }, "threatIntel": { "fieldMap": { "hbaseThreatIntel": ["ip_src_addr", "ip_dst_addr"] }, "fieldToTypeMap": { "ip_src_addr" : ["malicious_ip"], "ip_dst_addr" : ["malicious_ip"] } } }
However I am now getting another error in the enrichment join bolt as below
2016-10-11 19:59:09 o.a.m.e.b.JoinBolt [ERROR] [Metron] Unable to join messages: {"enrichments.geo.ip_dst_addr":"","adapter.geoadapter.end.ts":"1476215988122","enrichments.geo.ip_src_addr":" ","adapter.geoadapter.begin.ts":"1476215988122","source.type":"squid"}
Created 10-11-2016 08:43 PM
@Aaron Harris Is there any more context to the error log. For example a log entry that says Joining problem?
Also is your geo enrichment topology running cleanly? Check out the Storm UI. Go to the enrichment topology and see if you have any errors with the geo bolt. A common issue is that that geoenrichment bolt requires mysql. If it isn't running the geo bolt will fail.
Check out Michael Young's article : https://community.hortonworks.com/content/kbentry/59801/troubleshooting-missing-events-in-metron-qui...
It is geared toward quick dev but much of it applies to other deployments as well.
Created 10-12-2016 09:39 AM
Thanks for the pointers, I restarted the Metron VM and started the services again and it seems to be up and running now, I can see an index in elastic for the squid data.
My next problem however is that when I try to create an index in elastic for the squid data it can't resolve any Time-field names and the dropdown is blank.
I have inspected the data in the head plugin and there is definitely a "timestamp" field held under "_source" field in the JSON data.
Created 10-12-2016 08:09 PM
Are you following the tutorial below:
After deleting the index, I got a similar error. I generated some more log entries to the squid topic. Then I was able configure the index and the timestamp appeared.
Created 10-13-2016 07:57 AM
Yes that is the tutorial I am following, however I have another problem now with the Storm UI throwing an error, even though Ambari is showing that all the Storm components are running fine.
java.lang.RuntimeException: Could not find leader nimbus from seed hosts ["node1"]. Did you specify a valid list of nimbus hosts for config nimbus.seeds
I have posted it in a separate topic as I think the enrichment issue is now resolved as I can see index data in elastic.
How did you delete the index? Was that through elastic, and was it just a case of pushing the logs to the kafka topic that created the new index and resolved the issue.
Thanks again for all your help, really appreciate it.
Created 10-13-2016 08:40 AM
Fixed the storm issue it was to do with a backup I took of the storm local data directory when I was having problems starting the storm supervisor. I restored the nimbus/stormdist data from my backup and it is started up correctly now.
So next step is to look at the indexing in elasticsearch.
Created 10-13-2016 01:16 PM
>How did you delete the index? Was that through elastic, and was it just a case of pushing the logs to the kafka topic that created the new index and resolved the issue
See Step 3 of the tutorial. You an use the rest service to delete the index:
curl -XDELETE node1:9200/squid*
Created 10-14-2016 09:24 AM
Thanks for that have removed the historical index from elastic so I now have no squid indexes, however I am back to the previous problem with the enrichmentJoinBolt, I have checked and MySQL is running and no errors are showing in the geoEnrichmentBolt, I am also getting data from bro and yaf showing up in elastic.
Have worked through the troubleshooting article and cannot see any problems, the only thing I can think of is that the enrichment config I have provided has something incorrect in it although I cannot see what.
Pasted below is the error in the Storm geo enrichment bolt and a few of the logs either side, there are no other errors in Storm UI.
2016-10-14 09:11:47 b.s.d.executor [INFO] Prepared bolt simpleHBaseEnrichmentBolt:(8) 2016-10-14 09:12:07 o.a.m.e.b.JoinBolt [ERROR] [Metron] Unable to join messages: {"enrichments.geo.ip_dst_addr":"","adapter.geoadapter.end.ts":"1476436327576","enrichments.geo.ip_src_addr":"","adapter.geoadapter.begin.ts":"1476436327576","source.type":"squid"} java.lang.NullPointerException: null at org.apache.metron.enrichment.bolt.EnrichmentJoinBolt.joinMessages(EnrichmentJoinBolt.java:76) ~[stormjar.jar:na] at org.apache.metron.enrichment.bolt.EnrichmentJoinBolt.joinMessages(EnrichmentJoinBolt.java:33) ~[stormjar.jar:na] at org.apache.metron.enrichment.bolt.JoinBolt.execute(JoinBolt.java:111) ~[stormjar.jar:na] at backtype.storm.daemon.executor$fn__7014$tuple_action_fn__7016.invoke(executor.clj:670) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.daemon.executor$mk_task_receiver$fn__6937.invoke(executor.clj:426) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.disruptor$clojure_handler$reify__6513.onEvent(disruptor.clj:58) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.daemon.executor$fn__7014$fn__7027$fn__7078.invoke(executor.clj:808) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.util$async_loop$fn__545.invoke(util.clj:475) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:na] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40] 2016-10-14 09:12:07 b.s.d.executor [ERROR] java.lang.NullPointerException: null at org.apache.metron.enrichment.bolt.EnrichmentJoinBolt.joinMessages(EnrichmentJoinBolt.java:76) ~[stormjar.jar:na] at org.apache.metron.enrichment.bolt.EnrichmentJoinBolt.joinMessages(EnrichmentJoinBolt.java:33) ~[stormjar.jar:na] at org.apache.metron.enrichment.bolt.JoinBolt.execute(JoinBolt.java:111) ~[stormjar.jar:na] at backtype.storm.daemon.executor$fn__7014$tuple_action_fn__7016.invoke(executor.clj:670) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.daemon.executor$mk_task_receiver$fn__6937.invoke(executor.clj:426) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.disruptor$clojure_handler$reify__6513.onEvent(disruptor.clj:58) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.daemon.executor$fn__7014$fn__7027$fn__7078.invoke(executor.clj:808) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at backtype.storm.util$async_loop$fn__545.invoke(util.clj:475) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557] at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:na] at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40] 2016-10-14 09:12:42 b.s.m.n.Server [INFO] Getting metrics for server on port 6703 2016-10-14 09:12:45 s.k.ZkCoordinator [INFO] Task [1/1] Refreshing partition manager connections 2016-10-14 09:12:45 s.k.DynamicBrokersReader [INFO] Read partition info from zookeeper: GlobalPartitionInformation{partitionMap={0=node1:6667}}