NiFi JsonRecordSetWriter help, timestamp field manipulation

New Contributor

I'm trying to use a ListenUDPRecord processor to parse syslog messages as input, using SyslogReader as Record Reader and JsonRecordSetWriter as Record Writer.

The solution is working as I am getting a json message as output with the following fields: priority, severity, facility, version, timestamp, hostname, body.

The json must then be indexed in a solr collection, and the problem is that I get a timestamp field of the form: Nov 16 12:36:32, but I would need a unix timestamp format field (e.g. 1637062592000) or an output like this: "2021-11-16 12:36:32".

I tried to specify the "Timestamp Format" field of the JsonRecordSetWriter service (e.g. "yyyy-MM-dd HH:mm:ss" or "MM/dd/yyyy HH:mm:ss"), but the output does not change.

How can I change the structure of the timestamp field of my output json message?


1 ACCEPTED SOLUTION

Master Mentor

@prova 

 

Based on the timestamp shared, the source is RFC3164 syslog messages, in which the timestamp does not include a year.
The SyslogReader supports both RFC3164 and RFC5424 syslog messages, but applies a generic syslog schema to the source data:
    {
      "type" : "record",
      "name" : "nifiRecord",
      "namespace" : "org.apache.nifi",
      "fields" : [ {
        "name" : "priority",
        "type" : [ "null", "string" ]
      }, {
        "name" : "severity",
        "type" : [ "null", "string" ]
      }, {
        "name" : "facility",
        "type" : [ "null", "string" ]
      }, {
        "name" : "version",
        "type" : [ "null", "string" ]
      }, {
        "name" : "timestamp",
        "type" : [ "null", "string" ]
      }, {
        "name" : "hostname",
        "type" : [ "null", "string" ]
      }, {
        "name" : "body",
        "type" : [ "null", "string" ]
      } ]
    }
You can see that timestamp is treated as a string.

When it comes to the reformatting you are looking for, where is NiFi expected to get the year from, since it is not in the syslog message?

Since the schema treats the timestamp as a string, it can't be handled as a timestamp type for reformatting. This is possible with RFC5424 formatted source syslog messages.
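
To illustrate the difference between the two formats, here is a minimal Python sketch (plain Python, not NiFi code). It assumes an RFC5424 timestamp arrives as an ISO 8601 string with an explicit offset; the sample values and the helper name to_epoch_millis are made up for illustration. A string like that can be resolved to a Unix timestamp on its own, while the RFC3164 form cannot, because it carries no year or time zone.

```python
import re
from datetime import datetime
from typing import Optional

# RFC3164 style: "Mmm dd hh:mm:ss" (day may be space-padded), no year, no zone.
RFC3164_TS = re.compile(r"^[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}$")


def to_epoch_millis(ts: str) -> Optional[int]:
    """Hypothetical helper: epoch milliseconds, or None if the string
    does not carry enough information to identify an instant."""
    if RFC3164_TS.match(ts):
        return None  # no year and no zone: the instant is ambiguous
    # Assumes an ISO 8601 string with an explicit offset such as +01:00.
    return round(datetime.fromisoformat(ts).timestamp() * 1000)


print(to_epoch_millis("Nov 16 12:36:32"))            # RFC3164 style
print(to_epoch_millis("2021-11-16T12:36:32+01:00"))  # RFC5424 style
```

With a +01:00 offset, the second sample resolves to 1637062592000, the value used as an example in the question. The first sample returns None.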

This is not to say that you could not manipulate this date string via some downstream processor, but you would still need to figure out where you are going to get the year from. NiFi can't assume that an RFC3164 formatted syslog message was produced in the same year that NiFi is parsing it. This becomes hard to handle even via some downstream processor at the end of the year, where the NiFi servers may already be in 2022, for example, but the received RFC3164 syslog messages were produced in 2021.
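
If you do decide to guess the year downstream, one possible heuristic is sketched below in plain Python (for example, as logic to port into a scripted processor). This is an assumption on my part, not something NiFi does: it picks the most recent year that does not put the message more than max_skew ahead of the time it was received. The function name infer_year, the max_skew parameter, and the sample values are hypothetical, and the sketch assumes the sender and receiver clocks are in the same time zone.

```python
from datetime import datetime, timedelta

MONTHS = {name: number for number, name in enumerate(
    ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
     "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"], start=1)}


def infer_year(ts: str, received: datetime,
               max_skew: timedelta = timedelta(days=1)) -> datetime:
    """Attach a guessed year to an RFC3164 'Mmm dd hh:mm:ss' string.

    Heuristic (an assumption): use the latest year for which the message
    is not more than max_skew in the future relative to 'received'.
    """
    month_name, day, clock = ts.split()  # split() also handles "Nov  6"
    hour, minute, second = (int(part) for part in clock.split(":"))
    for year in (received.year + 1, received.year, received.year - 1):
        try:
            candidate = datetime(year, MONTHS[month_name], int(day),
                                 hour, minute, second)
        except ValueError:  # e.g. Feb 29 in a non-leap year
            continue
        if candidate - received <= max_skew:
            return candidate
    raise ValueError(f"cannot infer a year for {ts!r}")


# A message produced just before midnight on Dec 31, parsed just after.
guessed = infer_year("Dec 31 23:59:58", datetime(2022, 1, 1, 0, 0, 5))
print(guessed.strftime("%Y-%m-%d %H:%M:%S"))
```

In this example the guess lands in 2021 rather than 2022. The guess is still wrong for messages that are replayed or delayed by close to a year or more, which is why the missing year cannot be fully solved downstream.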

RFC3164 was made obsolete when RFC5424 was introduced. RFC3164 syslog messages are produced by older systems, and the options here are limited.


If you found that this response assisted with your query, please take a moment to log in and click on "Accept as Solution" below this post.

Thank you,

Matt
