Created on 12-11-2017 03:02 PM - edited 08-17-2019 09:49 AM
Building schemas by hand is tedious and error-prone. The InferAvroSchema processor can get you started: it generates a valid Avro schema from your data. There is one caveat: you have to make sure your field names are Apache Avro safe, meaning each name starts with a letter or underscore and contains only letters, digits, and underscores. I have written a custom processor that cleans your attribute names if you need them Avro-safe; see the source link at the end of this article.
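To make the Avro-safe naming caveat concrete, here is a minimal sketch in Python of one way to clean a field name. It is not the logic of the custom processor linked in this article; the function name `to_avro_safe_name` and the choice of underscore as the replacement character are assumptions made for illustration. It relies only on the Avro naming rule that a name must match `[A-Za-z_][A-Za-z0-9_]*`.

```python
import re

# Any character outside the set Avro allows in names.
_INVALID_CHARS = re.compile(r"[^A-Za-z0-9_]")


def to_avro_safe_name(name):
    """Return a variant of `name` that matches [A-Za-z_][A-Za-z0-9_]*.

    Hypothetical helper: replaces each disallowed character with "_" and
    prefixes "_" when the result is empty or starts with a digit.
    """
    cleaned = _INVALID_CHARS.sub("_", name)
    if not cleaned or not re.match(r"[A-Za-z_]", cleaned):
        cleaned = "_" + cleaned
    return cleaned


if __name__ == "__main__":
    for raw in ["schema1.tableName", "2017-total", "already_safe"]:
        print(raw, "->", to_avro_safe_name(raw))
```

Note that two different input names can map to the same cleaned name (for example `a.b` and `a-b`), so a real flow may need to detect and resolve such collisions.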
Example Flow Utilizing InferAvroSchema
InferAvroSchema Details
Step 0: Use Apache NiFi to Convert Data to JSON or CSV
Step 1: Send JSON or CSV Data to InferAvroSchema
I recommend setting Schema Output Destination to flowfile-attribute, Input Content Type to json, and Pretty Avro Output to true.
Step 2: The new schema is now in the attribute inferred.avro.schema.
inferred.avro.schema
{
  "type" : "record",
  "name" : "schema1",
  "fields" : [ {
    "name" : "table",
    "type" : "string",
    "doc" : "Type inferred from '\"schema1.tableName\"'"
  } ]
}
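As a quick sanity check outside NiFi, the attribute value can be parsed as ordinary JSON. The sketch below uses only the Python standard library; the helper name `summarize_schema` is hypothetical, and the schema text is the example value shown above, copied into a raw string so the escaped quotes survive.

```python
import json

# Example value of the inferred.avro.schema attribute, copied from above.
SCHEMA_TEXT = r"""
{
  "type" : "record",
  "name" : "schema1",
  "fields" : [ {
    "name" : "table",
    "type" : "string",
    "doc" : "Type inferred from '\"schema1.tableName\"'"
  } ]
}
"""


def summarize_schema(schema_text):
    """Return (record name, {field name: field type}) for an Avro record schema."""
    schema = json.loads(schema_text)
    fields = {field["name"]: field["type"] for field in schema["fields"]}
    return schema["name"], fields


if __name__ == "__main__":
    record_name, fields = summarize_schema(SCHEMA_TEXT)
    print(record_name, fields)
```

This only confirms that the text is well-formed JSON with the expected record layout; it does not validate the schema against the Avro specification.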
This schema can then be used for conversions directly, or stored in the Hortonworks Schema Registry or Apache NiFi's built-in AvroSchemaRegistry.
Now you can use it with ConvertRecord, QueryRecord, and other record-oriented processors.
Example Generated Schema in Avro-JSON Format Stored in Hortonworks Schema Registry:
Source: https://github.com/tspannhw/nifi-attributecleaner-processor