Objective

This is the second of a two-article series on the ValidateRecord processor. The first walks you through a NiFi flow that converts a CSV file into JSON format and validates the data against a given schema.

This article discusses the effects of enabling/disabling the "Strict Type Checking" property of the ValidateRecord processor.

Note: The ValidateRecord processor was introduced in NiFi 1.4.0.

Environment

This tutorial was tested using the following environment and components:

  • Mac OS X 10.11.6
  • Apache NiFi 1.4.0

Strict Type Checking Property

A useful property of the ValidateRecord processor is "Strict Type Checking". If the incoming data has a Record where a field is not of the correct type, this property determines how to handle the Record. If set to "true", the Record will be considered invalid. If set to "false", the Record will be considered valid.
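The decision described above can be modeled in a few lines of Python. This is only a simplified sketch, not NiFi's implementation: the two-field `SCHEMA`, the helper names `field_ok` and `is_record_valid`, and the rule that lenient mode accepts any value Python can convert to the expected type are all assumptions made for illustration.

```python
# Simplified model of the "Strict Type Checking" decision; not NiFi's code.

# Hypothetical two-field schema; the article states these two types.
SCHEMA = {"competitorname": str, "chocolate": int}


def field_ok(value, expected, strict):
    """Return True if value is acceptable for the expected type."""
    # bool is a subclass of int in Python, so exclude it explicitly.
    exact = isinstance(value, expected) and not isinstance(value, bool)
    if strict or exact:
        return exact
    # Lenient mode (assumption): accept anything Python can convert.
    try:
        expected(value)
        return True
    except (TypeError, ValueError):
        return False


def is_record_valid(record, schema, strict):
    """A record is valid only if every schema field it contains passes."""
    return all(
        field_ok(record[name], expected, strict)
        for name, expected in schema.items()
        if name in record
    )


string_for_int = {"competitorname": "One dime", "chocolate": "0"}
number_for_string = {"competitorname": 3.14159, "chocolate": 1}

for strict in (True, False):
    for record in (string_for_int, number_for_string):
        print(strict, is_record_valid(record, SCHEMA, strict))
```

In this model both sample records fail when `strict` is true and pass when it is false. A value that cannot be converted at all, such as "abc" for an int field, fails in either mode.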

To demonstrate both cases, we need to ingest data that can distinguish between different types (which our CSV data from the first article could not). Let's grab a snippet of the JSON candy data and make some changes. Specifically, let's put a string value for the "chocolate" field (which is of type int) and a decimal value for the "competitorname" field (which is of type string):

[ {
  "competitorname" : "One dime",
  "chocolate" : "0",
  "fruity" : 0,
  "caramel" : 0,
  "peanutyalmondy" : 0,
  "nougat" : 0,
  "crispedricewafer" : 0,
  "hard" : 0,
  "bar" : 0,
  "pluribus" : 0,
  "sugarpercent" : 0.011,
  "pricepercent" : 0.116,
  "winpercent" : 32.261086
}, {
  "competitorname" : 3.14159,
  "chocolate" : 1,
  "fruity" : 0,
  "caramel" : 0,
  "peanutyalmondy" : 0,
  "nougat" : 0,
  "crispedricewafer" : 1,
  "hard" : 0,
  "bar" : 0,
  "pluribus" : 1,
  "sugarpercent" : 0.87199998,
  "pricepercent" : 0.84799999,
  "winpercent" : 49.524113
} ]

Here is the JSON file: type-checking.txt (Change the extension from .txt to .json after downloading)
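If the attachment is not available, the same file can be recreated locally. The sketch below writes the two records shown above to type-checking.json and reads them back to confirm that the two deliberate type mismatches survive serialization. The temporary output directory is a stand-in chosen so the example runs anywhere; replace it with the input directory your GetFile processor watches.

```python
import json
import tempfile
from pathlib import Path

# The two records from the article, with the deliberate type mismatches.
records = [
    {
        "competitorname": "One dime",
        "chocolate": "0",  # string where the schema expects an int
        "fruity": 0,
        "caramel": 0,
        "peanutyalmondy": 0,
        "nougat": 0,
        "crispedricewafer": 0,
        "hard": 0,
        "bar": 0,
        "pluribus": 0,
        "sugarpercent": 0.011,
        "pricepercent": 0.116,
        "winpercent": 32.261086,
    },
    {
        "competitorname": 3.14159,  # number where the schema expects a string
        "chocolate": 1,
        "fruity": 0,
        "caramel": 0,
        "peanutyalmondy": 0,
        "nougat": 0,
        "crispedricewafer": 1,
        "hard": 0,
        "bar": 0,
        "pluribus": 1,
        "sugarpercent": 0.87199998,
        "pricepercent": 0.84799999,
        "winpercent": 49.524113,
    },
]

# Stand-in directory (assumption); use your GetFile input directory instead.
output_dir = Path(tempfile.mkdtemp())
output_path = output_dir / "type-checking.json"
output_path.write_text(json.dumps(records, indent=2), encoding="utf-8")

# Read the file back and report the JSON-level types of the altered fields.
loaded = json.loads(output_path.read_text(encoding="utf-8"))
print(output_path)
print(type(loaded[0]["chocolate"]).__name__)
print(type(loaded[1]["competitorname"]).__name__)
```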

Place the type-checking.json file in your input directory:

43491-22-input-directory.png

In order to process the JSON file, the ValidateRecord processor needs to use a JSON Record Reader. Go to the configuration window for the processor and select "Create new service..." for the Record Reader:

43492-23-create-new-service.png

Select JsonTreeReader, then "Create":

43493-24-add-jsontreereader.png

Then select the arrow icon next to the reader:

43494-25-goto-jsontreereader.png

When prompted, save the changes you made before going to the controller service.

Go to the configuration window of the JsonTreeReader controller service, select "AvroSchemaRegistry" for the Schema Registry and then select Apply:

43496-26-jsontreereader-properties.png

Enable the JsonTreeReader service. The flow is ready to run.

Start the GetFile, UpdateAttribute and ValidateRecord processors. With "Strict Type Checking" set to "true", both records are considered invalid and are routed to the "invalid" connection:

43497-27-stricttypechecking-invalid.png

43498-28-stricttypechecking-invalid-details.png

Start the LogAttribute processor to clear the queue. Stop all processors. Place the type-checking.json file in your input directory again.

Now let's change the Strict Type Checking property to "false":

43500-29-stricttypechecking-false.png

Running the flow this time, both records are considered valid and are routed to the "valid" connection:

43501-30-stricttypechecking-valid.png

Note: The documentation for the Strict Type Checking property states that when set to false, the relevant record fields will be coerced into the correct type. This functionality is currently broken (see NIFI-4579).

Revision 2 of 2. Last update: 08-17-2019 10:13 AM