Support Questions

Find answers, ask questions, and share your expertise
Announcements
Celebrating as our community reaches 100,000 members! Thank you!

Grok Parsing Error for New Metron Telemetry Data Source

avatar
Contributor

Hi,

I'm getting the following error from Storm's ParserBolt in a single node deployment of Metron 0.3.1:

java.lang.IllegalStateException: Grok parser Error: Grok statement produced a null message. Original message was: {"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"} and the parsed message was: {} . Check the pattern at: /apps/metron/patterns/my_telemetry_source on {"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"} at org.apache.metron.parsers.GrokParser.parse(GrokParser.java:174) at org.apache.metron.parsers.interfaces.MessageParser.parseOptional(MessageParser.java:45) at org.apache.metron.parsers.bolt.ParserBolt.execute(ParserBolt.java:132) at org.apache.storm.daemon.executor$fn__6573$tuple_action_fn__6575.invoke(executor.clj:734) at org.apache.storm.daemon.executor$mk_task_receiver$fn__6494.invoke(executor.clj:466) at org.apache.storm.disruptor$clojure_handler$reify__6007.onEvent(disruptor.clj:40) at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451) at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430) at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) at org.apache.storm.daemon.executor$fn__6573$fn__6586$fn__6639.invoke(executor.clj:853) at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484) at clojure.lang.AFn.run(AFn.java:22) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Grok statement produced a null message. Original message was: {"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"} and the parsed message was: {} . Check the pattern at: /apps/metron/patterns/my_telemetry_source at org.apache.metron.parsers.GrokParser.parse(GrokParser.java:152) ... 12 more

My input lines are of the following form:

{"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"}
{"hostname":"myhost","timestamp":"1012101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"}

My Grok Pattern is:

MY_TELEMETRY_SOURCE_DELIMITED %{GREEDYDATA:UNWANTED}{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}","cpu_user":"%{NUMBER:cpu_user}","cpu_sys":"%{NUMBER:cpu_sys}","cpu_nice":"%{NUMBER:cpu_nice}","cpu_idle":"%{NUMBER:cpu_idle}","cpu_wait":"%{NUMBER:cpu_wait}","cpu_irq":"%{NUMBER:cpu_irq}","cpu_soft_irq":"%{NUMBER:cpu_soft_irq}","cpu_stolen":"%{NUMBER:cpu_stolen}","load_one":"%{NUMBER:load_one}","load_five":"%{NUMBER:load_five}","load_fifteen":"%{NUMBER:load_fifteen}"}%{GREEDYDATA:UNWANTED}

Which according to the GrokConstructor (http://grokconstructor.appspot.com) matches correctly.

All Metron services are up and the pattern is staged at /apps/metron/patterns/my_telemetry_source as evident by the error log.

My parser configuration for the parsing topology is located at /usr/metron/0.3.1/config/zookeeper/parsers/my_telemetry_source.json and consists of the following:

{
    "parserClassName": "org.apache.metron.parsers.GrokParser",
    "sensorTopic": "my_telemetry_source",
    "parserConfig": {
        "grokPath": "/apps/metron/patterns/my_telemetry_source",        
        "patternLabel": "MY_TELEMETRY_SOURCE_DELIMITED",
        "timestampField": "timestamp"
    }
}

I have tried this setup both with and without applying the patternLabel, MY_TELEMETRY_SOURCE_DELIMITED, getting the same error message in both cases.

I appreciate your help.

1 ACCEPTED SOLUTION

avatar
Contributor

I eventually got around this problem by using the CSV parser instead of grok. I was using metron 3.0 at the time and am not sure if the issue with grok got resolved or not, but I don't have any problem with the CSV parser in the current version , 3.1.

View solution in original post

13 REPLIES 13

avatar
Super Collaborator

Hello H S,

In your Grok pattern, can you try with escaping the curly braces after and before the GREEDYDATA

You could copy-paste the one below:

MY_TELEMETRY_SOURCE_DELIMITED %{GREEDYDATA:UNWANTED}\{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}","cpu_user":"%{NUMBER:cpu_user}","cpu_sys":"%{NUMBER:cpu_sys}","cpu_nice":"%{NUMBER:cpu_nice}","cpu_idle":"%{NUMBER:cpu_idle}","cpu_wait":"%{NUMBER:cpu_wait}","cpu_irq":"%{NUMBER:cpu_irq}","cpu_soft_irq":"%{NUMBER:cpu_soft_irq}","cpu_stolen":"%{NUMBER:cpu_stolen}","load_one":"%{NUMBER:load_one}","load_five":"%{NUMBER:load_five}","load_fifteen":"%{NUMBER:load_fifteen}"\}%{GREEDYDATA:UNWANTED}

Let me know if that works.

Regards,

Anand

avatar
Contributor

Hi @asubramanian,

Thanks for your response. I made your suggested change and I'm not getting the parsing error any more. I can't see the data getting indexed though in elastic search. This is likely unrelated to the parsing error, but I'm not sure. I'm not getting any data corresponding to my telemetry source, but am getting the following error for the snort sensor from hdfsIndexingBolt:

org.apache.hadoop.security.AccessControlException: Permission denied: user=storm, access=WRITE, inode="/apps/metron/indexing/indexed/snort/enrichment-null-0-0-1490726017771.json":metron:metron:drwxrwxr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1827) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1811) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1794) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2598) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2533) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2417) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:729) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:405) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307) at sun.reflect.GeneratedConstructorAccessor31.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73) at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1628) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638) at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448) at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:776) at org.apache.metron.writer.hdfs.SourceHandler.createOutputFile(SourceHandler.java:142) at org.apache.metron.writer.hdfs.SourceHandler.initialize(SourceHandler.java:98) at org.apache.metron.writer.hdfs.SourceHandler.<init>(SourceHandler.java:65) at org.apache.metron.writer.hdfs.HdfsWriter.getSourceHandler(HdfsWriter.java:102) at org.apache.metron.writer.hdfs.HdfsWriter.write(HdfsWriter.java:77) at org.apache.metron.writer.BulkWriterComponent.write(BulkWriterComponent.java:138) at org.apache.metron.writer.bolt.BulkMessageWriterBolt.execute(BulkMessageWriterBolt.java:115) at org.apache.storm.daemon.executor$fn__6573$tuple_action_fn__6575.invoke(executor.clj:734) at org.apache.storm.daemon.executor$mk_task_receiver$fn__6494.invoke(executor.clj:466) at org.apache.storm.disruptor$clojure_handler$reify__6007.onEvent(disruptor.clj:40) at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451) at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430) at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) at org.apache.storm.daemon.executor$fn__6573$fn__6586$fn__6639.invoke(executor.clj:853) at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484) at clojure.lang.AFn.run(AFn.java:22) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=storm, access=WRITE, inode="/apps/metron/indexing/indexed/snort/enrichment-null-0-0-1490726017771.json":metron:metron:drwxrwxr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1827) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1811) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1794) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2598) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2533) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2417) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:729) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:405) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307) at org.apache.hadoop.ipc.Client.call(Client.java:1476) at org.apache.hadoop.ipc.Client.call(Client.java:1407) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) at com.sun.proxy.$Proxy46.create(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296) at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) at com.sun.proxy.$Proxy48.create(Unknown Source) at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1623) ... 28 more

Do I need to setup permissions for storm user on hdfs?

How do you suggest I test that the data from my telemetry source successfully passes the parsing phase?

Thanks again for your help.

avatar
Super Collaborator

Hi HS, on my box, the permissions on the HDFS folder /apps/metron/indexing/indexed are metron:metron. Can you check if yours is the same?

avatar
Contributor

@asubramanian,

I forgot to add my indexing configuration to the previous post. Here it goes:

{
   "elasticsearch": {
      "index": "my_telemetry_source",
      "batchSize": 5,
         "enabled" : true
   },
   "hdfs": {
      "index": "my_telemetry_source",
      "batchSize": 5,
      "enabled" : true
   }
}

avatar
Contributor

@asubramanian,

Strange... After restarting everything I'm getting the initial parse error again.

avatar
Contributor

@asubramanian

I can confirm that escaping the braces does not wok. Here's a simplified version of my problem:

Telemetry Input (I'm using NiFi with JoltTransform processor to run this test):

{"hostname":"my.machine","timestamp":"1490746284"} 

Grok Pattern 1:

{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}"}

Grok Pattern 2:

\{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}"\}

They both result in the following error in Metron's parser bolt:

java.lang.IllegalStateException: Grok parser Error: Grok statement produced a null message. Original message was: {"hostname":"my.machine","timestamp":"1490746284"} and the parsed message was: {}

Note that Pattern 1 is a match according to the grok constructor website: http://grokconstructor.appspot.com/

I'm not sure what's going on. Is this a bug in Metron grok parser or am I doing something wrong? I greatly appreciate your help.

avatar
Super Collaborator

Hm.. let me try that on my setup and revert.

avatar
Contributor

I eventually got around this problem by using the CSV parser instead of grok. I was using metron 3.0 at the time and am not sure if the issue with grok got resolved or not, but I don't have any problem with the CSV parser in the current version , 3.1.

avatar
New Contributor

Can you show us please, in steps, how to use CSV parser??