JsonRecordSetWriter not using date format field on inherited schema

I'm currently trying to put together a flow which takes a series of Excel spreadsheets containing records (one per line) and converts them to JSON for use with an API. I'm having some problems with UK/US date formatting: despite following https://community.cloudera.com/t5/Support-Questions/Use-NiFi-to-change-the-format-of-numeric-date-an... the JsonRecordSetWriter appears to ignore the date format.

 

The Excel files are in a series of different formats - the specific format isn't that important, but each column has a JSON label as the header and one or more rows of data. For example:

aledt_0-1631883987638.png

 

The output JSON must match the input file. I can't define a fixed schema, as each input file may have a different schema.

 

I have a flow that looks something like this:

aledt_1-1631884026711.png

(the flow goes on to do other things after adding the submitted_at field).

 

The ConvertRecord processor is fairly simple and has a CSV reader and a JSON writer. We're seeing the CSV created consistently with the US date format (MM/dd/yy), but the API needs the UK date format (dd/MM/yyyy).

 

The CSVReader config:

aledt_2-1631884220783.png

 

Note that we're inferring the schema from the CSV and defining the date format so that NiFi recognises the format.

 

For the writer:

aledt_3-1631884447253.png

 

This is inheriting the schema - we don't know ahead of time what the schema will be so I can't use a fixed schema. For debugging, I've enabled writing the schema to an attribute so that I could see what it was doing. Note the date format here is in UK format.

 

When I run this, however, it continues to output the date in US format. 

 

 

[
	{
		"first_name":"John",
		"last_name":"Smith",
		"date_of_birth":"11/28/1970"
	}
]
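For reference, the conversion expected from the writer can be sketched outside NiFi. This is a minimal Python illustration only, not NiFi configuration; the helper name `us_to_uk` is hypothetical, and it assumes the value arrives as a four-digit-year MM/dd/yyyy string, as in the output above.

```python
from datetime import datetime


def us_to_uk(date_text: str) -> str:
    """Parse a US-style MM/dd/yyyy string and render it as dd/MM/yyyy."""
    parsed = datetime.strptime(date_text, "%m/%d/%Y")
    return parsed.strftime("%d/%m/%Y")


# The record as currently emitted by the flow.
record = {
    "first_name": "John",
    "last_name": "Smith",
    "date_of_birth": "11/28/1970",
}

# The value the API needs instead.
record["date_of_birth"] = us_to_uk(record["date_of_birth"])
print(record["date_of_birth"])  # 28/11/1970
```

In other words, the desired output for the record above is "date_of_birth":"28/11/1970".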

 

 

The generated schema suggests that it's picking up the date format correctly:

 

{"name":"first_name","type":["null","string"]},
{"name":"last_name","type":["null","string"]},
{"name":"date_of_birth","type":["null",{"type":"int","logicalType":"date"}]}
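It may help to spell out what that schema entry means. In Avro, the `date` logical type annotates an `int` holding the number of days since 1 January 1970, so the schema itself carries no display format; the string that ends up in the JSON depends on how the writer formats that value. The Python sketch below only illustrates this representation. The function names are hypothetical, and it does not reproduce NiFi's internal behaviour.

```python
from datetime import date, timedelta

# Avro's date logical type counts days from the Unix epoch.
EPOCH = date(1970, 1, 1)


def to_avro_date(d: date) -> int:
    """Encode a calendar date as days since 1970-01-01."""
    return (d - EPOCH).days


def format_avro_date(days: int, pattern: str) -> str:
    """Decode the day count and render it with a strftime pattern."""
    return (EPOCH + timedelta(days=days)).strftime(pattern)


days = to_avro_date(date(1970, 11, 28))
print(days)                                # 331
print(format_avro_date(days, "%m/%d/%Y"))  # 11/28/1970 (US)
print(format_avro_date(days, "%d/%m/%Y"))  # 28/11/1970 (UK)
```

The same stored value can be rendered either way, so a correct inferred type does not by itself show which format the writer will apply.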

 

 

When I ran a test file through using the schema that had been recorded to the avro.schema attribute, defined explicitly rather than inherited, the processor worked and converted the date. However, because the input file columns vary, I can't define a fixed schema and need to infer the schema from the CSV each time.

 

Am I missing something in how I've configured it? Or have I stumbled across a defect?

 

Thanks

Aled.
