Support Questions

Find answers, ask questions, and share your expertise
Announcements
Welcome to the upgraded Community! Read this blog to see What’s New!

Grok Parsing Error for New Metron Telemetry Data Source

avatar
Explorer

Hi,

I'm getting the following error from Storm's ParserBolt in a single node deployment of Metron 0.3.1:

java.lang.IllegalStateException: Grok parser Error: Grok statement produced a null message. Original message was: {"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"} and the parsed message was: {} . Check the pattern at: /apps/metron/patterns/my_telemetry_source on {"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"} at org.apache.metron.parsers.GrokParser.parse(GrokParser.java:174) at org.apache.metron.parsers.interfaces.MessageParser.parseOptional(MessageParser.java:45) at org.apache.metron.parsers.bolt.ParserBolt.execute(ParserBolt.java:132) at org.apache.storm.daemon.executor$fn__6573$tuple_action_fn__6575.invoke(executor.clj:734) at org.apache.storm.daemon.executor$mk_task_receiver$fn__6494.invoke(executor.clj:466) at org.apache.storm.disruptor$clojure_handler$reify__6007.onEvent(disruptor.clj:40) at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451) at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430) at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) at org.apache.storm.daemon.executor$fn__6573$fn__6586$fn__6639.invoke(executor.clj:853) at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484) at clojure.lang.AFn.run(AFn.java:22) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.RuntimeException: Grok statement produced a null message. Original message was: {"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"} and the parsed message was: {} . Check the pattern at: /apps/metron/patterns/my_telemetry_source at org.apache.metron.parsers.GrokParser.parse(GrokParser.java:152) ... 12 more

My input lines are of the following form:

{"hostname":"myhost","timestamp":"1010101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"}
{"hostname":"myhost","timestamp":"1012101010","cpu_user":"8.522706","cpu_sys":"29.210946","cpu_nice":"0.000000","cpu_idle":"62.266348","cpu_wait":"0.000000","cpu_irq":"0.000000","cpu_soft_irq":"0.000000","cpu_stolen":"0.000000","load_one":"4.945312","load_five":"4.391113","load_fifteen":"4.385254"}

My Grok Pattern is:

MY_TELEMETRY_SOURCE_DELIMITED %{GREEDYDATA:UNWANTED}{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}","cpu_user":"%{NUMBER:cpu_user}","cpu_sys":"%{NUMBER:cpu_sys}","cpu_nice":"%{NUMBER:cpu_nice}","cpu_idle":"%{NUMBER:cpu_idle}","cpu_wait":"%{NUMBER:cpu_wait}","cpu_irq":"%{NUMBER:cpu_irq}","cpu_soft_irq":"%{NUMBER:cpu_soft_irq}","cpu_stolen":"%{NUMBER:cpu_stolen}","load_one":"%{NUMBER:load_one}","load_five":"%{NUMBER:load_five}","load_fifteen":"%{NUMBER:load_fifteen}"}%{GREEDYDATA:UNWANTED}

Which according to the GrokConstructor (http://grokconstructor.appspot.com) matches correctly.

All Metron services are up and the pattern is staged at /apps/metron/patterns/my_telemetry_source as evident by the error log.

My parser configuration for the parsing topology is located at /usr/metron/0.3.1/config/zookeeper/parsers/my_telemetry_source.json and consists of the following:

{
    "parserClassName": "org.apache.metron.parsers.GrokParser",
    "sensorTopic": "my_telemetry_source",
    "parserConfig": {
        "grokPath": "/apps/metron/patterns/my_telemetry_source",        
        "patternLabel": "MY_TELEMETRY_SOURCE_DELIMITED",
        "timestampField": "timestamp"
    }
}

I have tried this setup both with and without applying the patternLabel, MY_TELEMETRY_SOURCE_DELIMITED, getting the same error message in both cases.

I appreciate your help.

1 ACCEPTED SOLUTION

avatar
Explorer

I eventually got around this problem by using the CSV parser instead of grok. I was using metron 3.0 at the time and am not sure if the issue with grok got resolved or not, but I don't have any problem with the CSV parser in the current version , 3.1.

View solution in original post

13 REPLIES 13

avatar
Super Collaborator

Hello H S,

In your Grok pattern, can you try with escaping the curly braces after and before the GREEDYDATA

You could copy-paste the one below:

MY_TELEMETRY_SOURCE_DELIMITED %{GREEDYDATA:UNWANTED}\{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}","cpu_user":"%{NUMBER:cpu_user}","cpu_sys":"%{NUMBER:cpu_sys}","cpu_nice":"%{NUMBER:cpu_nice}","cpu_idle":"%{NUMBER:cpu_idle}","cpu_wait":"%{NUMBER:cpu_wait}","cpu_irq":"%{NUMBER:cpu_irq}","cpu_soft_irq":"%{NUMBER:cpu_soft_irq}","cpu_stolen":"%{NUMBER:cpu_stolen}","load_one":"%{NUMBER:load_one}","load_five":"%{NUMBER:load_five}","load_fifteen":"%{NUMBER:load_fifteen}"\}%{GREEDYDATA:UNWANTED}

Let me know if that works.

Regards,

Anand

avatar
Explorer

Hi @asubramanian,

Thanks for your response. I made your suggested change and I'm not getting the parsing error any more. I can't see the data getting indexed though in elastic search. This is likely unrelated to the parsing error, but I'm not sure. I'm not getting any data corresponding to my telemetry source, but am getting the following error for the snort sensor from hdfsIndexingBolt:

org.apache.hadoop.security.AccessControlException: Permission denied: user=storm, access=WRITE, inode="/apps/metron/indexing/indexed/snort/enrichment-null-0-0-1490726017771.json":metron:metron:drwxrwxr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1827) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1811) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1794) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2598) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2533) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2417) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:729) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:405) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307) at sun.reflect.GeneratedConstructorAccessor31.newInstance(Unknown Source) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) at java.lang.reflect.Constructor.newInstance(Constructor.java:423) at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106) at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73) at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1628) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1703) at org.apache.hadoop.hdfs.DFSClient.create(DFSClient.java:1638) at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:448) at org.apache.hadoop.hdfs.DistributedFileSystem$7.doCall(DistributedFileSystem.java:444) at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:459) at org.apache.hadoop.hdfs.DistributedFileSystem.create(DistributedFileSystem.java:387) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:909) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:890) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:787) at org.apache.hadoop.fs.FileSystem.create(FileSystem.java:776) at org.apache.metron.writer.hdfs.SourceHandler.createOutputFile(SourceHandler.java:142) at org.apache.metron.writer.hdfs.SourceHandler.initialize(SourceHandler.java:98) at org.apache.metron.writer.hdfs.SourceHandler.<init>(SourceHandler.java:65) at org.apache.metron.writer.hdfs.HdfsWriter.getSourceHandler(HdfsWriter.java:102) at org.apache.metron.writer.hdfs.HdfsWriter.write(HdfsWriter.java:77) at org.apache.metron.writer.BulkWriterComponent.write(BulkWriterComponent.java:138) at org.apache.metron.writer.bolt.BulkMessageWriterBolt.execute(BulkMessageWriterBolt.java:115) at org.apache.storm.daemon.executor$fn__6573$tuple_action_fn__6575.invoke(executor.clj:734) at org.apache.storm.daemon.executor$mk_task_receiver$fn__6494.invoke(executor.clj:466) at org.apache.storm.disruptor$clojure_handler$reify__6007.onEvent(disruptor.clj:40) at org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:451) at org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:430) at org.apache.storm.disruptor$consume_batch_when_available.invoke(disruptor.clj:73) at org.apache.storm.daemon.executor$fn__6573$fn__6586$fn__6639.invoke(executor.clj:853) at org.apache.storm.util$async_loop$fn__554.invoke(util.clj:484) at clojure.lang.AFn.run(AFn.java:22) at java.lang.Thread.run(Thread.java:745) Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=storm, access=WRITE, inode="/apps/metron/indexing/indexed/snort/enrichment-null-0-0-1490726017771.json":metron:metron:drwxrwxr-x at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:319) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:292) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:213) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:190) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1827) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1811) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkAncestorAccess(FSDirectory.java:1794) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInternal(FSNamesystem.java:2598) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFileInt(FSNamesystem.java:2533) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.startFile(FSNamesystem.java:2417) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.create(NameNodeRpcServer.java:729) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.create(ClientNamenodeProtocolServerSideTranslatorPB.java:405) at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:640) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2313) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2309) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1724) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2307) at org.apache.hadoop.ipc.Client.call(Client.java:1476) at org.apache.hadoop.ipc.Client.call(Client.java:1407) at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:229) at com.sun.proxy.$Proxy46.create(Unknown Source) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.create(ClientNamenodeProtocolTranslatorPB.java:296) at sun.reflect.GeneratedMethodAccessor46.invoke(Unknown Source) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:498) at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187) at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102) at com.sun.proxy.$Proxy48.create(Unknown Source) at org.apache.hadoop.hdfs.DFSOutputStream.newStreamForCreate(DFSOutputStream.java:1623) ... 28 more

Do I need to setup permissions for storm user on hdfs?

How do you suggest I test that the data from my telemetry source successfully passes the parsing phase?

Thanks again for your help.

avatar
Super Collaborator

Hi HS, on my box, the permissions on the HDFS folder /apps/metron/indexing/indexed are metron:metron. Can you check if yours is the same?

avatar
Explorer

@asubramanian,

I forgot to add my indexing configuration to the previous post. Here it goes:

{
   "elasticsearch": {
      "index": "my_telemetry_source",
      "batchSize": 5,
         "enabled" : true
   },
   "hdfs": {
      "index": "my_telemetry_source",
      "batchSize": 5,
      "enabled" : true
   }
}

avatar
Explorer

@asubramanian,

Strange... After restarting everything I'm getting the initial parse error again.

avatar
Explorer

@asubramanian

I can confirm that escaping the braces does not wok. Here's a simplified version of my problem:

Telemetry Input (I'm using NiFi with JoltTransform processor to run this test):

{"hostname":"my.machine","timestamp":"1490746284"} 

Grok Pattern 1:

{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}"}

Grok Pattern 2:

\{"hostname":"%{HOSTNAME:hostname}","timestamp":"%{NUMBER:timestamp}"\}

They both result in the following error in Metron's parser bolt:

java.lang.IllegalStateException: Grok parser Error: Grok statement produced a null message. Original message was: {"hostname":"my.machine","timestamp":"1490746284"} and the parsed message was: {}

Note that Pattern 1 is a match according to the grok constructor website: http://grokconstructor.appspot.com/

I'm not sure what's going on. Is this a bug in Metron grok parser or am I doing something wrong? I greatly appreciate your help.

avatar
Super Collaborator

Hm.. let me try that on my setup and revert.

avatar
Explorer

I eventually got around this problem by using the CSV parser instead of grok. I was using metron 3.0 at the time and am not sure if the issue with grok got resolved or not, but I don't have any problem with the CSV parser in the current version , 3.1.

avatar
New Contributor

Can you show us please, in steps, how to use CSV parser??

avatar
New Contributor

Hello @H S,

Have you find a solution for Grok parsing Error :

java.lang.IllegalStateException: Grok parser Error: Grok statement produced a null message. Original message was: {"hostname":"my.machine","timestamp":"1490746284"} and the parsed message was: {}

I actually have the same error with a very simple Grok pattern :

PARSE_TXT %{DATA:email}\:%{DATA:name}

For the moment, i just try to parse the following text :

toto:toto
tata:tata
titi:titi

I have tested this on the grok constructor website: http://grokconstructor.appspot.com/ and it correctly match.

Anyone already had this error ?

My parser configuration :

{
 "parserClassName": "org.apache.metron.parsers.GrokParser",
 "sensorTopic": "parse_txt",
 "parserConfig": {
 "grokPath": "/apps/metron/patterns/parse_txt",
 "patternLabel": "PARSE_TXT"
 	}
}

Thanks !

avatar
Explorer

@Sebastien Lehuede:

No, I didn't find the root cause. But in my case, it was fairly easy to use the CSV parser instead of the Grok parser, with slight modifications to my telemetry source. You can find more details about the CSV parser here:

https://metron.apache.org/current-book/metron-platform/metron-parsers/index.html

Hope this helps.

avatar
New Contributor

Hi @H S

Thanks for your answer, i have increased resources on my Full Dev VM and used Metron UI to create new sensor, currently it works.

But i have a new problem, Elasticsearch not index logs. I don't know if logs correctly reach Elasticsearch because in Storm UI, in parserBolt -> Output stats (for my topic), i don't have anythings. (Screen Below).

Has anyone ever had a problem with Kafka Stream ? Where logs correctly pass thought NiFi, correctly receive by Kafka and parse by Grok but are not redirect to Elasticsearch.

bolt-logs.png

avatar
New Contributor

I'm having a similar issue. In my case I'm feeding Yaf data to Kafka (it works correctly). Then it gives the parsing error:

java.lang.IllegalStateException: Grok parser Error: Grok statement produced a null message. Original message was:

I used the Metron's default YAF Grok pattern. I have checked the logs in http://grokconstructor.appspot.com and it is able to correctly match the pattern. Furthermore, I use the "Test" button on Metron's UI to check whether the patter is able to extract the fields from the log and it also works correctly.

Does anyone knows why this fails?

Example Log:

2018-04-04 10:05:32.533|2018-04-04 10:05:32.533| 0.000| 0.000| 6| 10.5.0.83| 2869| 10.5.0.97|52023| AR| 0| 0| 0|c019c454|00000000|000|000| 1| 40| 0| 0| 0|

Yaf Pattern

YAF_TIME_FORMAT %{YEAR:UNWANTED}-%{MONTHNUM:UNWANTED}-%{MONTHDAY:UNWANTED}[T ]%{HOUR:UNWANTED}:%{MINUTE:UNWANTED}:%{SECOND:UNWANTED} 

YAF_DELIMITED %{YAF_TIME_FORMAT:start_time}\|%{YAF_TIME_FORMAT:end_time}\|%{SPACE:UNWANTED}%{BASE10NUM:duration}\|%{SPACE:UNWANTED}%{BASE10NUM:rtt}\|%{SPACE:UNWANTED}%{INT:protocol}\|%{SPACE:UNWANTED}%{IP:ip_src_addr}\|%{SPACE:UNWANTED}%{INT:ip_src_port}\|%{SPACE:UNWANTED}%{IP:ip_dst_addr}\|%{SPACE:UNWANTED}%{INT:ip_dst_port}\|%{SPACE:UNWANTED}%{DATA:iflags}\|%{SPACE:UNWANTED}%{DATA:uflags}\|%{SPACE:UNWANTED}%{DATA:riflags}\|%{SPACE:UNWANTED}%{DATA:ruflags}\|%{SPACE:UNWANTED}%{WORD:isn}\|%{SPACE:UNWANTED}%{DATA:risn}\|%{SPACE:UNWANTED}%{DATA:tag}\|%{GREEDYDATA:rtag}\|%{SPACE:UNWANTED}%{INT:pkt}\|%{SPACE:UNWANTED}%{INT:oct}\|%{SPACE:UNWANTED}%{INT:rpkt}\|%{SPACE:UNWANTED}%{INT:roct}\|%{SPACE:UNWANTED}%{INT:app}\|%{GREEDYDATA:end_reason}

Labels