Metron indexing not working when tried with large files ( .1 million records ). Where to find proper logs

I have an Ambari cluster with 10 nodes, laid out as in the standard deployment architecture documentation, with 8 GB of memory per node. Indexing works fine for smaller files, but when I tried large files of 0.1 million records (a CSV file with the CSV parser, or JSON with Grok), indexing stops after writing around 90k records. There seems to be some issue with the Kafka indexing topology. How can I properly debug this? I could not find any errors in the Storm UI except a few warnings. I looked at the Storm worker log file, where I could see the lines below, which I suppose are causing the issue.

2018-04-24 06:13:36.787 o.a.s.k.s.i.OffsetManager Thread-17-kafkaSpout-executor[9 9] [WARN] topic-partition [newindexing-0] has unexpected offset [393]. Current committed Offset [1809252]
2018-04-24 06:13:36.787 o.a.s.k.s.i.OffsetManager Thread-17-kafkaSpout-executor[9 9] [WARN] topic-partition [newindexing-0] has unexpected offset [394]. Current committed Offset [1809252]
2018-04-24 06:13:36.787 o.a.s.k.s.i.OffsetManager Thread-17-kafkaSpout-executor[9 9] [WARN] topic-partition [newindexing-0] has unexpected offset [395]. Current committed Offset [1809252]
2018-04-24 06:13:36.787 o.a.s.k.s.i.OffsetManager Thread-17-kafkaSpout-executor[9 9] [WARN] topic-partition [newindexing-0] has unexpected offset [396]. Current committed Offset [1809252]
2018-04-24 06:13:36.787 o.a.s.k.s.i.OffsetManager Thread-17-kafkaSpout-executor[9 9] [WARN] topic-partition [newindexing-0] has unexpected offset [397]. Current committed Offset [1809252]
2018-04-24 06:13:36.787 o.a.s.k.s.i.OffsetManager Thread-17-kafkaSpout-executor[9 9] [WARN] topic-partition [newindexing-0] has unexpected offset [398]. Current committed Offset [1809252]
2018-04-24 06:13:36.800 o.a.s.d.executor Thread-5-kafkaSpout-executor[8 8] [INFO] Deactivating spout kafkaSpout:(8)
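
The warnings above repeat once per offset, so it can help to collapse them into one summary per topic-partition. The following is only a minimal sketch, assuming the worker log keeps exactly the `OffsetManager` message format shown above; the function name `summarize_offset_warnings` and the regular expression are hypothetical helpers, not part of Storm or Metron.

```python
import re

# Matches the OffsetManager warning text exactly as it appears in the log above.
WARN_RE = re.compile(
    r"topic-partition \[(?P<tp>[^\]]+)\] has unexpected offset \[(?P<offset>\d+)\]\. "
    r"Current committed Offset \[(?P<committed>\d+)\]"
)


def summarize_offset_warnings(lines):
    """Group 'unexpected offset' warnings by topic-partition.

    Returns a dict mapping topic-partition to the number of warnings, the
    lowest and highest unexpected offset seen, and the last committed offset.
    """
    summary = {}
    for line in lines:
        match = WARN_RE.search(line)
        if match is None:
            continue  # not an OffsetManager warning
        tp = match.group("tp")
        offset = int(match.group("offset"))
        committed = int(match.group("committed"))
        entry = summary.setdefault(
            tp,
            {"count": 0, "min_offset": offset, "max_offset": offset, "committed": committed},
        )
        entry["count"] += 1
        entry["min_offset"] = min(entry["min_offset"], offset)
        entry["max_offset"] = max(entry["max_offset"], offset)
        entry["committed"] = committed
    return summary


# Two of the warning lines from the log above, used as sample input.
sample = [
    "2018-04-24 06:13:36.787 o.a.s.k.s.i.OffsetManager Thread-17-kafkaSpout-executor[9 9] [WARN] "
    "topic-partition [newindexing-0] has unexpected offset [393]. Current committed Offset [1809252]",
    "2018-04-24 06:13:36.787 o.a.s.k.s.i.OffsetManager Thread-17-kafkaSpout-executor[9 9] [WARN] "
    "topic-partition [newindexing-0] has unexpected offset [398]. Current committed Offset [1809252]",
]

for tp, info in summarize_offset_warnings(sample).items():
    gap = info["committed"] - info["max_offset"]
    print(tp, info, "gap to committed offset:", gap)
```

To run it on a real log, pass an open file object instead of `sample`. The summary makes the distance between the offsets the spout is seeing and the committed offset visible at a glance.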

Where can I find more details in the logs? I got the topology ID from the Storm UI and looked under the worker artifacts on the node where the topology is running. I also checked the Storm supervisor logs, where I could see the log lines below.

/hdp/current/storm-supervisor/conf:/hadoop/storm/supervisor/stormdist/dnslog-10-1524558325/stormjar.jar:/etc/hbase/conf:/etc/hadoop/conf' 'or$
.apache.storm.daemon.worker' 'dnslog-10-1524558325' 'bbacde45-e453-4322-bdb9-27d41ef269b0' '6705' 'b66ba6e7-08be-486a-a6cc-27ec96ccd877'
2018-04-24 08:25:29.064 o.a.s.config Thread-3 [INFO] SET worker-user b66ba6e7-08be-486a-a6cc-27ec96ccd877 storm
2018-04-24 08:25:29.065 o.a.s.d.supervisor Thread-3 [INFO] Creating symlinks for worker-id: b66ba6e7-08be-486a-a6cc-27ec96ccd877 storm-id: dns$
og-10-1524558325 to its port artifacts directory
2018-04-24 08:25:29.066 o.a.s.d.supervisor Thread-3 [INFO] Creating symlinks for worker-id: b66ba6e7-08be-486a-a6cc-27ec96ccd877 storm-id: dns$
og-10-1524558325 for files(1): ("resources")
2018-04-24 08:25:29.067 o.a.s.d.supervisor Thread-3 [INFO] b66ba6e7-08be-486a-a6cc-27ec96ccd877 still hasn't started
2018-04-24 08:25:29.567 o.a.s.d.supervisor Thread-3 [INFO] b66ba6e7-08be-486a-a6cc-27ec96ccd877 still hasn't started
2018-04-24 08:25:30.068 o.a.s.d.supervisor Thread-3 [INFO] b66ba6e7-08be-486a-a6cc-27ec96ccd877 still hasn't started
2018-04-24 08:25:30.568 o.a.s.d.supervisor Thread-3 [INFO] b66ba6e7-08be-486a-a6cc-27ec96ccd877 still hasn't started
2018-04-24 08:25:31.068 o.a.s.d.supervisor Thread-3 [INFO] b66ba6e7-08be-486a-a6cc-27ec96ccd877 still hasn't started
2018-04-24 08:25:31.569 o.a.s.d.supervisor Thread-3 [INFO] b66ba6e7-08be-486a-a6cc-27ec96ccd877 still hasn't started
2018-04-24 08:25:32.069 o.a.s.d.supervisor Thread-3 [INFO] b66ba6e7-08be-486a-a6cc-27ec96ccd877 still hasn't started
2018-04-24 08:25:32.569 o.a.s.d.supervisor Thread-3 [INFO] b66ba6e7-08be-486a-a6cc-27ec96ccd877 still hasn't started
2018-04-24 08:25:33.070 o.a.s.d.supervisor Thread-3 [INFO] b66ba6e7-08be-486a-a6cc-27ec96ccd877 still hasn't started
2018-04-24 08:25:33.570 o.a.s.d.supervisor Thread-3 [INFO] b66ba6e7-08be-486a-a6cc-27ec96ccd877 still hasn't started
2018-04-24 08:25:34.072 o.a.s.d.supervisor Thread-3 [INFO] b66ba6e7-08be-486a-a6cc-27ec96ccd877 still hasn't started
2018-04-24 08:25:34.578 o.a.s.d.supervisor Thread-3 [INFO] b66ba6e7-08be-486a-a6cc-27ec96ccd877 still hasn't started
2018-04-24 08:25:35.081 o.a.s.d.supervisor Thread-3 [INFO] b66ba6e7-08be-486a-a6cc-27ec96ccd877 still hasn't started
2018-04-24 08:29:51.674 o.a.s.c.healthcheck timer [INFO] ()
2018-04-24 08:34:51.675 o.a.s.c.healthcheck timer [INFO] ()
2018-04-24 08:39:51.675 o.a.s.c.healthcheck timer [INFO] ()

For indexing I have 1 worker running with 832 MB assigned, as per the Storm UI. If this is a memory resource issue, I should see it somewhere in the logs, right? Also, I have created the Kafka topics (indexing and input topics) with 4 partitions.

Re: Metron indexing not working when tried with large files ( .1 million records ). Where to find proper logs

I have the same problem.

How did you solve your problem?

Thank you.