Grok parser creating timestamp by combining multiple fields


I have an Ambari-managed cluster running HDP 2.5.3.0. I need to create a Unix timestamp from my DNS log, in which the time information is spread across multiple fields. (Example log record: Nov 29 09:07:56 host_01456 named[5987]: client 192.168.209.43#443 query: linkedin.com IN A +ED)

As I understand it, the default Grok parser that ships with Metron expects the time information to be in a single field. Looking at the code, it formats the JSON value of every key listed in timeFields using the `dateFormat` field in parserConfig. This produces some invalid timestamps. How can I tell the grok statement to combine the values of several words into one and use that as the timeField value? Is there a simple solution using grok itself, without Stellar functions?

Grok statement I am using right now:

DNSLOG %{SYSLOGTIMESTAMP:timestamp} %{DATA:agent} %{DATA:protocol}\[%{POSINT:rqstPort}\]: %{DATA:client} %{IP:srcIp}#%{POSINT:srcPort} %{DATA}: %{DATA:domain} %{DATA:namespace} %{GREEDYDATA:typeName}
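To see what this statement captures, here is a rough Python sketch that approximates it with a plain regular expression and runs it on the sample record. The regex is a simplification, not the real grok definitions: it assumes SYSLOGTIMESTAMP expands to month, day and time (as in the stock grok patterns), treats IP as IPv4 only, and maps DATA and GREEDYDATA to lazy and greedy wildcards. Under those assumptions the `timestamp` capture already holds all three time tokens as one field.

```python
import re

# Simplified stand-in for the DNSLOG grok statement (an approximation,
# not the actual grok pattern definitions).
DNSLOG = re.compile(
    r"(?P<timestamp>[A-Z][a-z]{2} +\d{1,2} \d{2}:\d{2}:\d{2}) "  # SYSLOGTIMESTAMP
    r"(?P<agent>.*?) "                                           # DATA
    r"(?P<protocol>.*?)\[(?P<rqstPort>[1-9]\d*)\]: "             # DATA[POSINT]:
    r"(?P<client>.*?) "                                          # DATA
    r"(?P<srcIp>\d{1,3}(?:\.\d{1,3}){3})#(?P<srcPort>[1-9]\d*) " # IP#POSINT (IPv4 only)
    r".*?: "                                                     # unnamed DATA
    r"(?P<domain>.*?) "                                          # DATA
    r"(?P<namespace>.*?) "                                       # DATA
    r"(?P<typeName>.*)"                                          # GREEDYDATA
)

record = ("Nov 29 09:07:56 host_01456 named[5987]: "
          "client 192.168.209.43#443 query: linkedin.com IN A +ED")

fields = DNSLOG.match(record).groupdict()
for name, value in fields.items():
    print(f"{name} = {value!r}")
```

If the real grok patterns behave the same way, the month, day and time do not need to be combined by hand; the remaining problem is how the single `timestamp` string is parsed.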

Parser configuration:

"parserConfig": {
  "grokPath": "/apps/metron/patterns/dnslog",
  "patternLabel": "DNSLOG",
  "timeFields": [ "timestamp" ],
  "timestampField": "timestamp",
  "dateFormat": "MMM dd HH:MM:SS"
}
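One thing worth checking in this configuration, assuming Metron hands `dateFormat` to Java's SimpleDateFormat: in that pattern syntax `MM` means month and `SS` means milliseconds, while minutes and seconds are `mm` and `ss`. A format such as `MMM dd HH:mm:ss` may therefore be what was intended. The sample record also carries no year, so the parsed value needs one from somewhere. The Python sketch below only illustrates the target conversion to Unix epoch milliseconds; the helper name `syslog_to_epoch_millis`, the year 2016 and the UTC time zone are assumptions made for the example, not anything Metron provides.

```python
from datetime import datetime, timezone


def syslog_to_epoch_millis(text, year, tz=timezone.utc):
    """Convert a syslog-style 'Mon dd HH:mm:ss' string to epoch milliseconds.

    The year and time zone are supplied by the caller because the log
    record itself contains neither (both are assumptions here).
    """
    # %b parses the abbreviated month name (locale-dependent), %M is minutes
    # and %S is seconds, the Python counterparts of Java's MMM, mm and ss.
    parsed = datetime.strptime(f"{year} {text}", "%Y %b %d %H:%M:%S")
    return int(parsed.replace(tzinfo=tz).timestamp() * 1000)


# Hypothetical year; the sample record does not say which year it is from.
epoch_ms = syslog_to_epoch_millis("Nov 29 09:07:56", 2016)
print(epoch_ms)
```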
