Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Metron - Error enriching squid data in Storm

avatar
Contributor

Am running through the tutorial to add a new telemetry source into Metron and have encountered a problem with the enrichmentJoinBolt in Storm, it is failing to process any of the messages that the Squid topology has process with the below error;

2016-10-11 14:32:09 o.a.m.e.b.EnrichmentSplitterBolt [ERROR] Unable to retrieve a sensor enrichment config of squid
2016-10-11 14:32:09 o.a.m.e.b.EnrichmentJoinBolt [ERROR] Unable to retrieve a sensor enrichment config of squid
2016-10-11 14:32:09 o.a.m.e.b.JoinBolt [ERROR] [Metron] Unable to join messages: {"code":0,"method":"GET","enrichmentsplitterbolt.splitter.end.ts":"1476196329341","enrichmentsplitterbolt.splitter.begin.ts":"1476196329341","url":"https:\/\/tfl.gov.uk\/plan-a-journey\/","source.type":"squid","elapsed":31271,"ip_dst_addr":null,"original_string":"1476113538.772  31271 127.0.0.1 TCP_MISS\/000 0 GET https:\/\/tfl.gov.uk\/plan-a-journey\/ - DIRECT\/tfl.gov.uk -","bytes":0,"action":"TCP_MISS","ip_src_addr":"127.0.0.1","timestamp":1476113538772}
java.lang.NullPointerException: null
	at org.apache.metron.enrichment.bolt.EnrichmentJoinBolt.joinMessages(EnrichmentJoinBolt.java:76) ~[stormjar.jar:na]
	at org.apache.metron.enrichment.bolt.EnrichmentJoinBolt.joinMessages(EnrichmentJoinBolt.java:33) ~[stormjar.jar:na]
	at org.apache.metron.enrichment.bolt.JoinBolt.execute(JoinBolt.java:111) ~[stormjar.jar:na]
	at backtype.storm.daemon.executor$fn__7014$tuple_action_fn__7016.invoke(executor.clj:670) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.daemon.executor$mk_task_receiver$fn__6937.invoke(executor.clj:426) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.disruptor$clojure_handler$reify__6513.onEvent(disruptor.clj:58) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.daemon.executor$fn__7014$fn__7027$fn__7078.invoke(executor.clj:808) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.util$async_loop$fn__545.invoke(util.clj:475) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:na]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]

I am using the full-dev environment with Metron 0.2.0BETA and the guide, https://cwiki.apache.org/confluence/display/METRON/2016/04/25/Metron+Tutorial+-+Fundamentals+Part+1%...

I can see data in the kibana dashboard from Bro and Yaf, which both also have indexes created in elastic, however there is no index for the squid data.

I tried killing the Storm topologies and re-running ./run_enrichment_role.sh then after this restarting the squid parser topology.

Any help would be greatly appreciated.

1 ACCEPTED SOLUTION

avatar

@Aaron Harris

Check to be sure the enrichment config and parser configs for squid are installed using the zk_load_configs.sh with the -m DUMP method:

For example on quick dev run this command. The parser enrichment configs are in bold:

[vagrant@node1 ~]$ /usr/metron/0.2.0BETA/bin/zk_load_configs.sh -i /usr/metron/0.2.0BETA/config/zookeeper/ -m DUMP -z localhost:2181 | grep -i squid | grep Config

log4j:WARN No appenders could be found for logger (org.apache.curator.framework.imps.CuratorFrameworkImpl).

log4j:WARN Please initialize the log4j system properly.

log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

PARSER Config: squid

ENRICHMENT Config: squid

If not, check the zookeeper config directory:

[vagrant@node1 ~]$ ls /usr/metron/0.2.0BETA/config/zookeeper/enrichments/

bro.json snort.json squid.json websphere.json yaf.json

Then update zookeeper:

/usr/metron/0.2.0BETA/bin/zk_load_configs.sh -i /usr/metron/0.2.0BETA/config/zookeeper/ -m PUSH -z localhost:2181

Then you will probably need to restart the enrichment topology. From Ambari, go to the storm UI, click into the enrichment topology and then the Kill button. If you are using quick dev, monit should automatically restart.

View solution in original post

12 REPLIES 12

avatar

@Aaron Harris

Check to be sure the enrichment config and parser configs for squid are installed using the zk_load_configs.sh with the -m DUMP method:

For example on quick dev run this command. The parser enrichment configs are in bold:

[vagrant@node1 ~]$ /usr/metron/0.2.0BETA/bin/zk_load_configs.sh -i /usr/metron/0.2.0BETA/config/zookeeper/ -m DUMP -z localhost:2181 | grep -i squid | grep Config

log4j:WARN No appenders could be found for logger (org.apache.curator.framework.imps.CuratorFrameworkImpl).

log4j:WARN Please initialize the log4j system properly.

log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.

PARSER Config: squid

ENRICHMENT Config: squid

If not, check the zookeeper config directory:

[vagrant@node1 ~]$ ls /usr/metron/0.2.0BETA/config/zookeeper/enrichments/

bro.json snort.json squid.json websphere.json yaf.json

Then update zookeeper:

/usr/metron/0.2.0BETA/bin/zk_load_configs.sh -i /usr/metron/0.2.0BETA/config/zookeeper/ -m PUSH -z localhost:2181

Then you will probably need to restart the enrichment topology. From Ambari, go to the storm UI, click into the enrichment topology and then the Kill button. If you are using quick dev, monit should automatically restart.

avatar
Contributor

@cduby

Thanks the missing enrichment config was definitely the issue, I copied the yaf.json and amended the index JSON field to squid instead of yaf and then uploaded as per your instructions. Below is the squid.json file I created.

{
  "index":"squid",
  "batchSize": 5,
  "enrichment" : {
    "fieldMap":
      {
      "geo": ["ip_dst_addr", "ip_src_addr"]
    }
  },
  "threatIntel": {
    "fieldMap":
      {
      "hbaseThreatIntel": ["ip_src_addr", "ip_dst_addr"]
    },
    "fieldToTypeMap":
      {
      "ip_src_addr" : ["malicious_ip"],
      "ip_dst_addr" : ["malicious_ip"]
    }
  }
}

However I am now getting another error in the enrichment join bolt as below

2016-10-11 19:59:09 o.a.m.e.b.JoinBolt [ERROR] [Metron] Unable to join messages: {"enrichments.geo.ip_dst_addr":"","adapter.geoadapter.end.ts":"1476215988122","enrichments.geo.ip_src_addr":"
","adapter.geoadapter.begin.ts":"1476215988122","source.type":"squid"}

avatar

@Aaron Harris Is there any more context to the error log. For example a log entry that says Joining problem?

Also is your geo enrichment topology running cleanly? Check out the Storm UI. Go to the enrichment topology and see if you have any errors with the geo bolt. A common issue is that that geoenrichment bolt requires mysql. If it isn't running the geo bolt will fail.

Check out Michael Young's article : https://community.hortonworks.com/content/kbentry/59801/troubleshooting-missing-events-in-metron-qui...

It is geared toward quick dev but much of it applies to other deployments as well.

avatar
Contributor

@cduby

Thanks for the pointers, I restarted the Metron VM and started the services again and it seems to be up and running now, I can see an index in elastic for the squid data.

My next problem however is that when I try to create an index in elastic for the squid data it can't resolve any Time-field names and the dropdown is blank.

I have inspected the data in the head plugin and there is definitely a "timestamp" field held under "_source" field in the JSON data.

avatar
@Aaron Harris

Are you following the tutorial below:

https://cwiki.apache.org/confluence/display/METRON/2016/06/22/Metron+Tutorial+-+Fundamentals+Part+7%...

After deleting the index, I got a similar error. I generated some more log entries to the squid topic. Then I was able configure the index and the timestamp appeared.

avatar
Contributor

@cduby

Yes that is the tutorial I am following, however I have another problem now with the Storm UI throwing an error, even though Ambari is showing that all the Storm components are running fine.

java.lang.RuntimeException: Could not find leader nimbus from seed hosts ["node1"]. Did you specify a valid list of nimbus hosts for config nimbus.seeds

I have posted it in a separate topic as I think the enrichment issue is now resolved as I can see index data in elastic.

How did you delete the index? Was that through elastic, and was it just a case of pushing the logs to the kafka topic that created the new index and resolved the issue.

Thanks again for all your help, really appreciate it.

avatar
Contributor

Fixed the storm issue it was to do with a backup I took of the storm local data directory when I was having problems starting the storm supervisor. I restored the nimbus/stormdist data from my backup and it is started up correctly now.

So next step is to look at the indexing in elasticsearch.

avatar

@Aaron Harris

>How did you delete the index? Was that through elastic, and was it just a case of pushing the logs to the kafka topic that created the new index and resolved the issue

See Step 3 of the tutorial. You an use the rest service to delete the index:

curl -XDELETE node1:9200/squid*

avatar
Contributor

@cduby

Thanks for that have removed the historical index from elastic so I now have no squid indexes, however I am back to the previous problem with the enrichmentJoinBolt, I have checked and MySQL is running and no errors are showing in the geoEnrichmentBolt, I am also getting data from bro and yaf showing up in elastic.

Have worked through the troubleshooting article and cannot see any problems, the only thing I can think of is that the enrichment config I have provided has something incorrect in it although I cannot see what.

Pasted below is the error in the Storm geo enrichment bolt and a few of the logs either side, there are no other errors in Storm UI.

2016-10-14 09:11:47 b.s.d.executor [INFO] Prepared bolt simpleHBaseEnrichmentBolt:(8)
2016-10-14 09:12:07 o.a.m.e.b.JoinBolt [ERROR] [Metron] Unable to join messages: {"enrichments.geo.ip_dst_addr":"","adapter.geoadapter.end.ts":"1476436327576","enrichments.geo.ip_src_addr":"","adapter.geoadapter.begin.ts":"1476436327576","source.type":"squid"}
java.lang.NullPointerException: null
	at org.apache.metron.enrichment.bolt.EnrichmentJoinBolt.joinMessages(EnrichmentJoinBolt.java:76) ~[stormjar.jar:na]
	at org.apache.metron.enrichment.bolt.EnrichmentJoinBolt.joinMessages(EnrichmentJoinBolt.java:33) ~[stormjar.jar:na]
	at org.apache.metron.enrichment.bolt.JoinBolt.execute(JoinBolt.java:111) ~[stormjar.jar:na]
	at backtype.storm.daemon.executor$fn__7014$tuple_action_fn__7016.invoke(executor.clj:670) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.daemon.executor$mk_task_receiver$fn__6937.invoke(executor.clj:426) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.disruptor$clojure_handler$reify__6513.onEvent(disruptor.clj:58) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.daemon.executor$fn__7014$fn__7027$fn__7078.invoke(executor.clj:808) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.util$async_loop$fn__545.invoke(util.clj:475) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:na]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
2016-10-14 09:12:07 b.s.d.executor [ERROR] 
java.lang.NullPointerException: null
	at org.apache.metron.enrichment.bolt.EnrichmentJoinBolt.joinMessages(EnrichmentJoinBolt.java:76) ~[stormjar.jar:na]
	at org.apache.metron.enrichment.bolt.EnrichmentJoinBolt.joinMessages(EnrichmentJoinBolt.java:33) ~[stormjar.jar:na]
	at org.apache.metron.enrichment.bolt.JoinBolt.execute(JoinBolt.java:111) ~[stormjar.jar:na]
	at backtype.storm.daemon.executor$fn__7014$tuple_action_fn__7016.invoke(executor.clj:670) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.daemon.executor$mk_task_receiver$fn__6937.invoke(executor.clj:426) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.disruptor$clojure_handler$reify__6513.onEvent(disruptor.clj:58) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:125) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:99) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:80) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.daemon.executor$fn__7014$fn__7027$fn__7078.invoke(executor.clj:808) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at backtype.storm.util$async_loop$fn__545.invoke(util.clj:475) [storm-core-0.10.0.2.3.0.0-2557.jar:0.10.0.2.3.0.0-2557]
	at clojure.lang.AFn.run(AFn.java:22) [clojure-1.6.0.jar:na]
	at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
2016-10-14 09:12:42 b.s.m.n.Server [INFO] Getting metrics for server on port 6703
2016-10-14 09:12:45 s.k.ZkCoordinator [INFO] Task [1/1] Refreshing partition manager connections
2016-10-14 09:12:45 s.k.DynamicBrokersReader [INFO] Read partition info from zookeeper: GlobalPartitionInformation{partitionMap={0=node1:6667}}