Created 03-24-2017 10:24 PM
Hi,
I'm getting the following error from Storm's ParserBolt in a single node deployment of Metron 0.3.1:
java.lang.IllegalStateException: Grok parser Error: Grok statement produced a null message. Original message was: {"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"} and the parsed message was: {} . Check the pattern at: /apps/metron/patterns/my_telemetry_source on {"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"} at org.apache.metron.parsers.GrokParser.parse(GrokParser.java:174) at org.apache.metron.parsers.interfaces.MessageParser.parseOptional(MessageParser.java:45) at org.apache.metron.parsers.bolt.ParserBolt.execute(ParserBolt.java:132) at org.apache.storm.daemon.executor$fn__6573$tuple_action_fn__6575.invoke(executor.clj:734) at org.apache.storm.daemon.executor$mk_task_receiver$fn__6494.invoke(executor.clj:466) at org.apache.storm.disruptor$clojure_handler$reify__6007.onEvent(disruptor.clj:40) at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451) at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430) at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) at org.apache.storm.daemon.executor$fn__6573$fn__6586$fn__6639.invoke(executor.clj:853) at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484) at clojure.lang.AFn.run(AFn.java:22) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Grok statement produced a null message. Original message was: {"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"} and the parsed message was: {} . Check the pattern at: /apps/metron/patterns/my_telemetry_source at org.apache.metron.parsers.GrokParser.parse(GrokParser.java:152) ... 12 more
My input lines are of the following form:
{"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"} {"hostname":"myhost","timestamp":"1012101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"}
My Grok Pattern is:
MY_TELEMETRY_SOURCE_DELIMITED %{GREEDYDATA:UNWANTED}{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}","cpu_user":"%{NUMBER:cpu_user}","cpu_sys":"%{NUMBER:cpu_sys}","cpu_nice":"%{NUMBER:cpu_nice}","cpu_idle":"%{NUMBER:cpu_idle}","cpu_wait":"%{NUMBER:cpu_wait}","cpu_irq":"%{NUMBER:cpu_irq}","cpu_soft_irq":"%{NUMBER:cpu_soft_irq}","cpu_stolen":"%{NUMBER:cpu_stolen}","load_one":"%{NUMBER:load_one}","load_five":"%{NUMBER:load_five}","load_fifteen":"%{NUMBER:load_fifteen}"}%{GREEDYDATA:UNWANTED}
Which according to the GrokConstructor (http://grokconstructor.appspot.com) matches correctly.
All Metron services are up and the pattern is staged at /apps/metron/patterns/my_telemetry_source as evident by the error log.
My parser configuration for the parsing topology is located at /usr/metron/0.3.1/config/zookeeper/parsers/my_telemetry_source.json and consists of the following:
{ "parserClassName": "org.apache.metron.parsers.GrokParser", "sensorTopic": "my_telemetry_source", "parserConfig": { "grokPath": "/apps/metron/patterns/my_telemetry_source", "patternLabel": "MY_TELEMETRY_SOURCE_DELIMITED", "timestampField": "timestamp" } }
I have tried this setup both with and without applying the patternLabel, MY_TELEMETRY_SOURCE_DELIMITED, getting the same error message in both cases.
I appreciate your help.
Created 05-26-2017 09:17 PM
I eventually got around this problem by using the CSV parser instead of grok. I was using metron 3.0 at the time and am not sure if the issue with grok got resolved or not, but I don't have any problem with the CSV parser in the current version , 3.1.
Created 03-28-2017 09:21 AM
Hello H S,
In your Grok pattern, can you try with escaping the curly braces after and before the GREEDYDATA
You could copy-paste the one below:
MY_TELEMETRY_SOURCE_DELIMITED %{GREEDYDATA:UNWANTED}\{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}","cpu_user":"%{NUMBER:cpu_user}","cpu_sys":"%{NUMBER:cpu_sys}","cpu_nice":"%{NUMBER:cpu_nice}","cpu_idle":"%{NUMBER:cpu_idle}","cpu_wait":"%{NUMBER:cpu_wait}","cpu_irq":"%{NUMBER:cpu_irq}","cpu_soft_irq":"%{NUMBER:cpu_soft_irq}","cpu_stolen":"%{NUMBER:cpu_stolen}","load_one":"%{NUMBER:load_one}","load_five":"%{NUMBER:load_five}","load_fifteen":"%{NUMBER:load_fifteen}"\}%{GREEDYDATA:UNWANTED}
Let me know if that works.
Regards,
Anand
Created 03-28-2017 06:41 PM
Hi @asubramanian,
Thanks for your response. I made your suggested change and I'm not getting the parsing error any more. I can't see the data getting indexed though in elastic search. This is likely unrelated to the parsing error, but I'm not sure. I'm not getting any data corresponding to my telemetry source, but am getting the following error for the snort sensor from hdfsIndexingBolt:
org.apache.hadoop.security.AccessControlException: Permission denied: user=storm, access=WRITE, inode="/apps/metron/indexing/indexed/snort/enrichment-null-0-0-1490726017771.json":metron:metron:drwxrwxr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1827) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1811) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1794) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2598) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2533) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2417) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:729) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:405) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307) at sun.reflect.GeneratedConstructorAccessor31.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73) at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1628) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638) at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448) at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:776) at org.apache.metron.writer.hdfs.SourceHandler.createOutputFile(SourceHandler.java:142) at org.apache.metron.writer.hdfs.SourceHandler.initialize(SourceHandler.java:98) at org.apache.metron.writer.hdfs.SourceHandler.<init>(SourceHandler.java:65) at org.apache.metron.writer.hdfs.HdfsWriter.getSourceHandler(HdfsWriter.java:102) at org.apache.metron.writer.hdfs.HdfsWriter.write(HdfsWriter.java:77) at org.apache.metron.writer.BulkWriterComponent.write(BulkWriterComponent.java:138) at org.apache.metron.writer.bolt.BulkMessageWriterBolt.execute(BulkMessageWriterBolt.java:115) at org.apache.storm.daemon.executor$fn__6573$tuple_action_fn__6575.invoke(executor.clj:734) at org.apache.storm.daemon.executor$mk_task_receiver$fn__6494.invoke(executor.clj:466) at org.apache.storm.disruptor$clojure_handler$reify__6007.onEvent(disruptor.clj:40) at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451) at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430) at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) at org.apache.storm.daemon.executor$fn__6573$fn__6586$fn__6639.invoke(executor.clj:853) at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484) at clojure.lang.AFn.run(AFn.java:22) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=storm, access=WRITE, inode="/apps/metron/indexing/indexed/snort/enrichment-null-0-0-1490726017771.json":metron:metron:drwxrwxr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1827) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1811) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1794) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2598) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2533) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2417) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:729) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:405) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307) at org.apache.hadoop.ipc.Client.call(Client.java:1476) at org.apache.hadoop.ipc.Client.call(Client.java:1407) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) at com.sun.proxy.$Proxy46.create(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296) at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) at com.sun.proxy.$Proxy48.create(Unknown Source) at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1623) ... 28 more
Do I need to setup permissions for storm user on hdfs?
How do you suggest I test that the data from my telemetry source successfully passes the parsing phase?
Thanks again for your help.
Created 03-30-2017 10:23 AM
Hi HS, on my box, the permissions on the HDFS folder /apps/metron/indexing/indexed are metron:metron. Can you check if yours is the same?
Created 03-28-2017 06:46 PM
I forgot to add my indexing configuration to the previous post. Here it goes:
{ "elasticsearch": { "index": "my_telemetry_source", "batchSize": 5, "enabled" : true }, "hdfs": { "index": "my_telemetry_source", "batchSize": 5, "enabled" : true } }
Created 03-28-2017 10:43 PM
Strange... After restarting everything I'm getting the initial parse error again.
Created 03-29-2017 12:23 AM
I can confirm that escaping the braces does not wok. Here's a simplified version of my problem:
Telemetry Input (I'm using NiFi with JoltTransform processor to run this test):
{"hostname":"my.machine","timestamp":"1490746284"}
Grok Pattern 1:
{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}"}
Grok Pattern 2:
\{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}"\}
They both result in the following error in Metron's parser bolt:
java.lang.IllegalStateException: Grok parser Error: Grok statement produced a null message. Original message was: {"hostname":"my.machine","timestamp":"1490746284"} and the parsed message was: {}
Note that Pattern 1 is a match according to the grok constructor website: http://grokconstructor.appspot.com/
I'm not sure what's going on. Is this a bug in Metron grok parser or am I doing something wrong? I greatly appreciate your help.
Created 03-30-2017 10:24 AM
Hm.. let me try that on my setup and revert.
Created 05-26-2017 09:17 PM
I eventually got around this problem by using the CSV parser instead of grok. I was using metron 3.0 at the time and am not sure if the issue with grok got resolved or not, but I don't have any problem with the CSV parser in the current version , 3.1.
Created 08-15-2017 08:25 AM
Can you show us please, in steps, how to use CSV parser??
Created 07-20-2017 01:07 PM
Hello @H S,
Have you find a solution for Grok parsing Error :
java.lang.IllegalStateException: Grok parser Error: Grok statement produced a null message. Original message was: {"hostname":"my.machine","timestamp":"1490746284"} and the parsed message was: {}
I actually have the same error with a very simple Grok pattern :
PARSE_TXT %{DATA:email}\:%{DATA:name}
For the moment, i just try to parse the following text :
toto:toto tata:tata titi:titi
I have tested this on the grok constructor website: http://grokconstructor.appspot.com/ and it correctly match.
Anyone already had this error ?
My parser configuration :
{ "parserClassName": "org.apache.metron.parsers.GrokParser", "sensorTopic": "parse_txt", "parserConfig": { "grokPath": "/apps/metron/patterns/parse_txt", "patternLabel": "PARSE_TXT" } }
Thanks !
Created 07-20-2017 05:33 PM
No, I didn't find the root cause. But in my case, it was fairly easy to use the CSV parser instead of the Grok parser, with slight modifications to my telemetry source. You can find more details about the CSV parser here:
https://metron.apache.org/current-book/metron-platform/metron-parsers/index.html
Hope this helps.
Created 07-24-2017 09:08 AM
Hi @H S
Thanks for your answer, i have increased resources on my Full Dev VM and used Metron UI to create new sensor, currently it works.
But i have a new problem, Elasticsearch not index logs. I don't know if logs correctly reach Elasticsearch because in Storm UI, in parserBolt -> Output stats (for my topic), i don't have anythings. (Screen Below).
Has anyone ever had a problem with Kafka Stream ? Where logs correctly pass thought NiFi, correctly receive by Kafka and parse by Grok but are not redirect to Elasticsearch.
Created 04-06-2018 07:39 PM
I'm having a similar issue. In my case I'm feeding Yaf data to Kafka (it works correctly). Then it gives the parsing error:
java.lang.IllegalStateException: Grok parser Error: Grok statement produced a null message. Original message was:
I used the Metron's default YAF Grok pattern. I have checked the logs in http://grokconstructor.appspot.com and it is able to correctly match the pattern. Furthermore, I use the "Test" button on Metron's UI to check whether the patter is able to extract the fields from the log and it also works correctly.
Does anyone knows why this fails?
Example Log:
2018-04-04 10:05:32.533|2018-04-04 10:05:32.533| 0.000| 0.000| 6| 10.5.0.83| 2869| 10.5.0.97|52023| AR| 0| 0| 0|c019c454|00000000|000|000| 1| 40| 0| 0| 0|
Yaf Pattern
YAF_TIME_FORMAT %{YEAR:UNWANTED}-%{MONTHNUM:UNWANTED}-%{MONTHDAY:UNWANTED}[T ]%{HOUR:UNWANTED}:%{MINUTE:UNWANTED}:%{SECOND:UNWANTED}
YAF_DELIMITED %{YAF_TIME_FORMAT:start_time}\|%{YAF_TIME_FORMAT:end_time}\|%{SPACE:UNWANTED}%{BASE10NUM:duration}\|%{SPACE:UNWANTED}%{BASE10NUM:rtt}\|%{SPACE:UNWANTED}%{INT:protocol}\|%{SPACE:UNWANTED}%{IP:ip_src_addr}\|%{SPACE:UNWANTED}%{INT:ip_src_port}\|%{SPACE:UNWANTED}%{IP:ip_dst_addr}\|%{SPACE:UNWANTED}%{INT:ip_dst_port}\|%{SPACE:UNWANTED}%{DATA:iflags}\|%{SPACE:UNWANTED}%{DATA:uflags}\|%{SPACE:UNWANTED}%{DATA:riflags}\|%{SPACE:UNWANTED}%{DATA:ruflags}\|%{SPACE:UNWANTED}%{WORD:isn}\|%{SPACE:UNWANTED}%{DATA:risn}\|%{SPACE:UNWANTED}%{DATA:tag}\|%{GREEDYDATA:rtag}\|%{SPACE:UNWANTED}%{INT:pkt}\|%{SPACE:UNWANTED}%{INT:oct}\|%{SPACE:UNWANTED}%{INT:rpkt}\|%{SPACE:UNWANTED}%{INT:roct}\|%{SPACE:UNWANTED}%{INT:app}\|%{GREEDYDATA:end_reason}