Created 03-24-2017 10:24 PM
Hi,
I'm getting the following error from Storm's ParserBolt in a single node deployment of Metron 0.3.1:
java.lang.IllegalStateException: Grok parser Error: Grok statement produced a null message. Original message was: {"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"} and the parsed message was: {} . Check the pattern at: /apps/metron/patterns/my_telemetry_source on {"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"} at org.apache.metron.parsers.GrokParser.parse(GrokParser.java:174) at org.apache.metron.parsers.interfaces.MessageParser.parseOptional(MessageParser.java:45) at org.apache.metron.parsers.bolt.ParserBolt.execute(ParserBolt.java:132) at org.apache.storm.daemon.executor$fn__6573$tuple_action_fn__6575.invoke(executor.clj:734) at org.apache.storm.daemon.executor$mk_task_receiver$fn__6494.invoke(executor.clj:466) at org.apache.storm.disruptor$clojure_handler$reify__6007.onEvent(disruptor.clj:40) at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451) at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430) at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) at org.apache.storm.daemon.executor$fn__6573$fn__6586$fn__6639.invoke(executor.clj:853) at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484) at clojure.lang.AFn.run(AFn.java:22) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Grok statement produced a null message. Original message was: {"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"} and the parsed message was: {} . Check the pattern at: /apps/metron/patterns/my_telemetry_source at org.apache.metron.parsers.GrokParser.parse(GrokParser.java:152) ... 12 more
My input lines are of the following form:
{"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"} {"hostname":"myhost","timestamp":"1012101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"}
My Grok Pattern is:
MY_TELEMETRY_SOURCE_DELIMITED %{GREEDYDATA:UNWANTED}{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}","cpu_user":"%{NUMBER:cpu_user}","cpu_sys":"%{NUMBER:cpu_sys}","cpu_nice":"%{NUMBER:cpu_nice}","cpu_idle":"%{NUMBER:cpu_idle}","cpu_wait":"%{NUMBER:cpu_wait}","cpu_irq":"%{NUMBER:cpu_irq}","cpu_soft_irq":"%{NUMBER:cpu_soft_irq}","cpu_stolen":"%{NUMBER:cpu_stolen}","load_one":"%{NUMBER:load_one}","load_five":"%{NUMBER:load_five}","load_fifteen":"%{NUMBER:load_fifteen}"}%{GREEDYDATA:UNWANTED}
Which according to the GrokConstructor (http://grokconstructor.appspot.com) matches correctly.
All Metron services are up and the pattern is staged at /apps/metron/patterns/my_telemetry_source as evident by the error log.
My parser configuration for the parsing topology is located at /usr/metron/0.3.1/config/zookeeper/parsers/my_telemetry_source.json and consists of the following:
{ "parserClassName": "org.apache.metron.parsers.GrokParser", "sensorTopic": "my_telemetry_source", "parserConfig": { "grokPath": "/apps/metron/patterns/my_telemetry_source", "patternLabel": "MY_TELEMETRY_SOURCE_DELIMITED", "timestampField": "timestamp" } }
I have tried this setup both with and without applying the patternLabel, MY_TELEMETRY_SOURCE_DELIMITED, getting the same error message in both cases.
I appreciate your help.
Created 05-26-2017 09:17 PM
I eventually got around this problem by using the CSV parser instead of grok. I was using metron 3.0 at the time and am not sure if the issue with grok got resolved or not, but I don't have any problem with the CSV parser in the current version , 3.1.
Created 03-28-2017 09:21 AM
Hello H S,
In your Grok pattern, can you try with escaping the curly braces after and before the GREEDYDATA
You could copy-paste the one below:
MY_TELEMETRY_SOURCE_DELIMITED %{GREEDYDATA:UNWANTED}\{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}","cpu_user":"%{NUMBER:cpu_user}","cpu_sys":"%{NUMBER:cpu_sys}","cpu_nice":"%{NUMBER:cpu_nice}","cpu_idle":"%{NUMBER:cpu_idle}","cpu_wait":"%{NUMBER:cpu_wait}","cpu_irq":"%{NUMBER:cpu_irq}","cpu_soft_irq":"%{NUMBER:cpu_soft_irq}","cpu_stolen":"%{NUMBER:cpu_stolen}","load_one":"%{NUMBER:load_one}","load_five":"%{NUMBER:load_five}","load_fifteen":"%{NUMBER:load_fifteen}"\}%{GREEDYDATA:UNWANTED}
Let me know if that works.
Regards,
Anand
Created 03-28-2017 06:41 PM
Hi @asubramanian,
Thanks for your response. I made your suggested change and I'm not getting the parsing error any more. I can't see the data getting indexed though in elastic search. This is likely unrelated to the parsing error, but I'm not sure. I'm not getting any data corresponding to my telemetry source, but am getting the following error for the snort sensor from hdfsIndexingBolt:
org.apache.hadoop.security.AccessControlException: Permission denied: user=storm, access=WRITE, inode="/apps/metron/indexing/indexed/snort/enrichment-null-0-0-1490726017771.json":metron:metron:drwxrwxr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1827) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1811) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1794) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2598) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2533) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2417) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:729) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:405) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307) at sun.reflect.GeneratedConstructorAccessor31.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73) at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1628) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638) at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448) at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:776) at org.apache.metron.writer.hdfs.SourceHandler.createOutputFile(SourceHandler.java:142) at org.apache.metron.writer.hdfs.SourceHandler.initialize(SourceHandler.java:98) at org.apache.metron.writer.hdfs.SourceHandler.<init>(SourceHandler.java:65) at org.apache.metron.writer.hdfs.HdfsWriter.getSourceHandler(HdfsWriter.java:102) at org.apache.metron.writer.hdfs.HdfsWriter.write(HdfsWriter.java:77) at org.apache.metron.writer.BulkWriterComponent.write(BulkWriterComponent.java:138) at org.apache.metron.writer.bolt.BulkMessageWriterBolt.execute(BulkMessageWriterBolt.java:115) at org.apache.storm.daemon.executor$fn__6573$tuple_action_fn__6575.invoke(executor.clj:734) at org.apache.storm.daemon.executor$mk_task_receiver$fn__6494.invoke(executor.clj:466) at org.apache.storm.disruptor$clojure_handler$reify__6007.onEvent(disruptor.clj:40) at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451) at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430) at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) at org.apache.storm.daemon.executor$fn__6573$fn__6586$fn__6639.invoke(executor.clj:853) at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484) at clojure.lang.AFn.run(AFn.java:22) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=storm, access=WRITE, inode="/apps/metron/indexing/indexed/snort/enrichment-null-0-0-1490726017771.json":metron:metron:drwxrwxr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1827) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1811) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1794) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2598) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2533) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2417) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:729) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:405) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307) at org.apache.hadoop.ipc.Client.call(Client.java:1476) at org.apache.hadoop.ipc.Client.call(Client.java:1407) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) at com.sun.proxy.$Proxy46.create(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296) at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) at com.sun.proxy.$Proxy48.create(Unknown Source) at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1623) ... 28 more
Do I need to setup permissions for storm user on hdfs?
How do you suggest I test that the data from my telemetry source successfully passes the parsing phase?
Thanks again for your help.
Created 03-30-2017 10:23 AM
Hi HS, on my box, the permissions on the HDFS folder /apps/metron/indexing/indexed are metron:metron. Can you check if yours is the same?
Created 03-28-2017 06:46 PM
I forgot to add my indexing configuration to the previous post. Here it goes:
{ "elasticsearch": { "index": "my_telemetry_source", "batchSize": 5, "enabled" : true }, "hdfs": { "index": "my_telemetry_source", "batchSize": 5, "enabled" : true } }
Created 03-28-2017 10:43 PM
Strange... After restarting everything I'm getting the initial parse error again.
Created 03-29-2017 12:23 AM
I can confirm that escaping the braces does not wok. Here's a simplified version of my problem:
Telemetry Input (I'm using NiFi with JoltTransform processor to run this test):
{"hostname":"my.machine","timestamp":"1490746284"}
Grok Pattern 1:
{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}"}
Grok Pattern 2:
\{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}"\}
They both result in the following error in Metron's parser bolt:
java.lang.IllegalStateException: Grok parser Error: Grok statement produced a null message. Original message was: {"hostname":"my.machine","timestamp":"1490746284"} and the parsed message was: {}
Note that Pattern 1 is a match according to the grok constructor website: http://grokconstructor.appspot.com/
I'm not sure what's going on. Is this a bug in Metron grok parser or am I doing something wrong? I greatly appreciate your help.
Created 03-30-2017 10:24 AM
Hm.. let me try that on my setup and revert.
Created 05-26-2017 09:17 PM
I eventually got around this problem by using the CSV parser instead of grok. I was using metron 3.0 at the time and am not sure if the issue with grok got resolved or not, but I don't have any problem with the CSV parser in the current version , 3.1.
Created 08-15-2017 08:25 AM
Can you show us please, in steps, how to use CSV parser??