Building schemas by hand is tedious and error-prone. The InferAvroSchema processor can get you started: it generates a valid Avro schema from your data. There is one caveat: you have to make sure your field names are Apache Avro-safe, meaning each name starts with a letter or underscore and contains only letters, digits, and underscores. I have a custom processor that will clean your attribute names if you need them Avro-safe; see the processor listed below.
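As a rough illustration of what "Avro-safe" means, here is a minimal Python sketch that checks and cleans field names. It is not the custom processor mentioned above; the function names (is_avro_safe, to_avro_safe, clean_keys) and the choice of underscore as the replacement character are assumptions made for this example.

```python
import re

# Avro names must start with a letter or underscore and contain only
# letters, digits, and underscores.
_AVRO_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_avro_safe(name: str) -> bool:
    """Return True if the name is already a legal Avro name."""
    return _AVRO_NAME.fullmatch(name) is not None


def to_avro_safe(name: str) -> str:
    """Replace illegal characters with underscores (an assumed convention)."""
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # A leading digit (or an empty name) is illegal, so prefix an underscore.
    if not cleaned or cleaned[0].isdigit():
        cleaned = "_" + cleaned
    return cleaned


def clean_keys(record: dict) -> dict:
    """Return a copy of a flat record with every key made Avro-safe."""
    return {to_avro_safe(key): value for key, value in record.items()}
```

Note that two different names can collapse to the same cleaned name (for example, "a-b" and "a_b"), so a real flow should check for collisions before relying on the result.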

Example Flow Utilizing InferAvroSchema


InferAvroSchema Details


Step 0: Use Apache NiFi to Convert Data to JSON or CSV

Step 1: Send JSON or CSV Data to InferAvroSchema

I recommend setting Schema Output Destination to flowfile-attribute, Input Content Type to json, and Pretty Avro Output to true.
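To give a feel for what inference involves, here is a simplified Python sketch that derives a record schema from one flat JSON object. It is not the algorithm InferAvroSchema uses; the function names and the type mapping (integers to long, floats to double) are assumptions for illustration, and nested objects and arrays are deliberately left out.

```python
import json


def avro_type(value):
    """Map a JSON value to an Avro primitive type name (assumed mapping)."""
    if value is None:
        return "null"
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "string"
    raise TypeError(f"nested or unsupported value: {value!r}")


def infer_schema(json_text, record_name):
    """Build an Avro record schema (as a dict) from one flat JSON object."""
    record = json.loads(json_text)
    fields = [{"name": key, "type": avro_type(val)} for key, val in record.items()]
    return {"type": "record", "name": record_name, "fields": fields}


# Sample input guessed from the doc string in the Step 2 example schema.
schema = infer_schema('{"table": "schema1.tableName"}', "schema1")
print(json.dumps(schema, indent=2))
```

The real processor also fills in a doc entry for each field, as the generated schema in Step 2 shows.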

Step 2: Read the New Schema from the inferred.avro.schema Attribute

{
  "type" : "record",
  "name" : "schema1",
  "fields" : [ {
    "name" : "table",
    "type" : "string",
    "doc" : "Type inferred from '\"schema1.tableName\"'"
  } ]
}
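If you want to sanity-check the generated schema outside NiFi, a short Python sketch using only the standard library can parse it and list the fields. The INFERRED_SCHEMA constant below is a hand-copied stand-in for the inferred.avro.schema attribute value, and field_types is a hypothetical helper name.

```python
import json

# Stand-in for the value of the inferred.avro.schema attribute,
# copied from the example above.
INFERRED_SCHEMA = r'''{"type": "record", "name": "schema1", "fields": [
  {"name": "table", "type": "string",
   "doc": "Type inferred from '\"schema1.tableName\"'"}
]}'''


def field_types(schema_text):
    """Return a {field name: type} mapping for an Avro record schema."""
    schema = json.loads(schema_text)
    if schema.get("type") != "record":
        raise ValueError("expected an Avro record schema")
    return {field["name"]: field["type"] for field in schema["fields"]}


print(field_types(INFERRED_SCHEMA))
```

This only confirms that the schema is well-formed JSON with the expected record layout; it does not validate it against the full Avro specification.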

This schema can then be used for conversions directly, or stored in the Hortonworks Schema Registry or in Apache NiFi's built-in AvroSchemaRegistry controller service.

Now you can use it with ConvertRecord, QueryRecord, and other record-oriented processors.

Example Generated Schema in Avro-JSON Format Stored in Hortonworks Schema Registry: