Created on 11-07-2017 08:03 PM - edited 08-17-2019 10:13 AM
Objective
This is the second article in a two-part series on the ValidateRecord processor. The first walks you through a NiFi flow that converts a CSV file into JSON format and validates the data against a given schema.
This article discusses the effects of enabling/disabling the "Strict Type Checking" property of the ValidateRecord processor.
Note: The ValidateRecord processor was introduced in NiFi 1.4.0.
Environment
This tutorial was tested using the following environment and components:
Mac OS X 10.11.6
Apache NiFi 1.4.0
Strict Type Checking Property
A useful property of the ValidateRecord processor is "Strict Type Checking". If the incoming data has a Record where a field is not of the correct type, this property determines how to handle the Record. If set to "true", the Record will be considered invalid. If set to "false", the Record will be considered valid.
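The rule above can be modeled in a few lines of Python. This is only a sketch of the behavior described in this article, not NiFi's implementation: the `SCHEMA` mapping, the `has_correct_types` and `route` functions, and the sample values are all hypothetical, and only the two fields discussed in this article are included.

```python
# Hypothetical two-field schema; field names and types follow the article.
SCHEMA = {"competitorname": str, "chocolate": int}


def has_correct_types(record, schema):
    """Return True if every schema field present in the record has the declared type."""
    for name, expected in schema.items():
        value = record.get(name)
        if value is None:
            continue  # missing/null handling is out of scope for this sketch
        # bool is a subclass of int in Python, so reject it explicitly for int fields.
        if isinstance(value, bool) and expected is int:
            return False
        if not isinstance(value, expected):
            return False
    return True


def route(record, schema, strict_type_checking):
    """Model which relationship a record is routed to, as described in the article."""
    if strict_type_checking and not has_correct_types(record, schema):
        return "invalid"
    return "valid"


# Illustrative record: a string where an int is expected,
# and a decimal where a string is expected.
mistyped = {"competitorname": 3.14, "chocolate": "1"}

print(route(mistyped, SCHEMA, strict_type_checking=True))   # invalid
print(route(mistyped, SCHEMA, strict_type_checking=False))  # valid
```

A correctly typed record is routed to "valid" in both modes; the property only changes the outcome for records with a type mismatch.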
To demonstrate both cases, we need to ingest data that can distinguish between different types (which our CSV data from the first article could not). Let's grab a snippet of the JSON candy data and make some changes. Specifically, let's put a string value for the "chocolate" field (which is of type int) and let's put a decimal value for the "competitorname" field (which is of type string):
Here is the JSON file: type-checking.txt (Change the extension from .txt to .json after downloading)
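To see why JSON input is needed for this test rather than CSV, the following Python analogy may help. It uses Python's standard `csv` and `json` modules, not NiFi's record readers, and the sample values (such as "Example Bar") are made up for illustration. CSV carries every value as text, while JSON keeps strings and numbers distinct.

```python
import csv
import io
import json

# Hypothetical one-record samples using the two fields discussed in the article.
csv_text = "competitorname,chocolate\nExample Bar,1\n"
json_text = '[{"competitorname": "Example Bar", "chocolate": 1}]'

# Parse the first record from each format.
csv_row = next(csv.DictReader(io.StringIO(csv_text)))
json_row = json.loads(json_text)[0]

# Record the Python type each parser produced for every field.
csv_types = {name: type(value).__name__ for name, value in csv_row.items()}
json_types = {name: type(value).__name__ for name, value in json_row.items()}

print(csv_types)   # {'competitorname': 'str', 'chocolate': 'str'}
print(json_types)  # {'competitorname': 'str', 'chocolate': 'int'}
```

Because the JSON form preserves the difference between `"1"` and `1`, it can express a string in an int field, which is exactly the kind of mismatch this test needs.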
Place the type-checking.json file in your input directory:
In order to process the JSON file, the ValidateRecord processor needs to use a JSON Record Reader. Go to the configuration window for the processor and select "Create new service..." for the Record Reader:
Select JsonTreeReader, then "Create":
Then select the arrow icon next to the reader:
Save the changes made before going to the Controller Service.
Go to the configuration window of the JsonTreeReader controller service, select "AvroSchemaRegistry" for the Schema Registry and then select Apply:
Enable the JsonTreeReader service. The flow is ready to run.
Start the GetFile, UpdateAttribute and ValidateRecord processors. With "Strict Type Checking" set to "true", the 2 records are considered invalid and are routed to the "invalid" connection:
Start the LogAttribute processor to clear the queue. Stop all processors. Place the type-checking.json file in your input directory again.
Now let's change the Strict Type Checking property to "false":
Running the flow this time, the 2 records are considered valid and are routed to the "valid" connection:
Note: The documentation for the Strict Type Checking property states that when set to false, the relevant record fields will be coerced into the correct type. This functionality is currently broken (see NIFI-4579).
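For reference, the coercion that the documentation describes would look roughly like the following. This is a hypothetical Python sketch of the documented intent, not NiFi code: `coerce` and `SCHEMA` are illustrative names, the sample values are made up, and leaving unconvertible values unchanged is an arbitrary choice made for this sketch.

```python
# Hypothetical two-field schema; field names and types follow the article.
SCHEMA = {"competitorname": str, "chocolate": int}


def coerce(record, schema):
    """Return a copy of the record with fields converted to their declared types."""
    coerced = dict(record)
    for name, expected in schema.items():
        value = coerced.get(name)
        if value is None or isinstance(value, expected):
            continue  # nothing to convert
        try:
            coerced[name] = expected(value)
        except (TypeError, ValueError):
            pass  # assumption: keep the original value if conversion fails
    return coerced


print(coerce({"competitorname": 3.14, "chocolate": "1"}, SCHEMA))
# {'competitorname': '3.14', 'chocolate': 1}
```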