Created on 03-11-2018 01:43 AM - edited 08-17-2019 08:27 AM
Big Data DevOps: Part 2: Schemas, Schemas, Schemas. Know Your Records, Know Your DataTypes, Know Your Fields, Know Your Data.
Since we can process records in Apache NiFi, Streaming Analytics Manager, Apache Kafka, and any tool that can work with a schema, we have a real need for a Schema Registry. I have mentioned them before. One important capability is automating the management of schemas. Today we will be listing and exporting them for backup and migration purposes. We will also cover how to upload new schemas and new versions of schemas.
The steps to back up schemas with Apache NiFi 1.5+ are easy.
Backup All Schemas
GetHTTP: get the list of schemas from the Schema Registry via a GET call
SplitJson: split the list into individual records
EvaluateJsonPath: extract the schema name
InvokeHTTP: get the schema body for that name
EvaluateJsonPath: turn the schema text into a separate flow file
Rename and save both the full JSON record from the registry and the schema text only.
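The same backup flow can be sketched outside of NiFi as a short script against the registry's REST API. This is a minimal sketch, not the article's NiFi flow: the host, port, and output paths are hypothetical, and it assumes the Hortonworks Schema Registry endpoints `GET /api/v1/schemaregistry/schemas` (a listing under an `entities` key) and `GET /api/v1/schemaregistry/schemas/{name}/versions/latest` (a record containing `schemaText`).

```python
import json
import os
import urllib.request

# Hypothetical registry location; adjust host and port for your cluster.
BASE = "http://registry-host:7788/api/v1/schemaregistry"


def schema_names(listing: dict) -> list:
    """SplitJson + EvaluateJsonPath: pull each schema name out of the
    listing returned by GET /schemas ({'entities': [{'schemaMetadata': ...}]})."""
    return [e["schemaMetadata"]["name"] for e in listing.get("entities", [])]


def fetch_json(url: str) -> dict:
    """GetHTTP / InvokeHTTP: GET a URL and parse the JSON body."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def backup_all_schemas(out_dir: str = "schema_backup") -> None:
    """Save both the full registry record and the schema text for every schema."""
    os.makedirs(out_dir, exist_ok=True)
    for name in schema_names(fetch_json(f"{BASE}/schemas")):
        record = fetch_json(f"{BASE}/schemas/{name}/versions/latest")
        # Full JSON record from the registry, as in the NiFi flow above.
        with open(os.path.join(out_dir, f"{name}.record.json"), "w") as f:
            json.dump(record, f, indent=2)
        # 'schemaText' holds the Avro schema itself as a string.
        with open(os.path.join(out_dir, f"{name}.avsc"), "w") as f:
            f.write(record["schemaText"])
```

Each function mirrors one processor step, which makes it easy to compare the script against the NiFi flow screenshots below.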
NiFi Flow
Initial Call to List All Schemas
Get The Schema Name
Example Schema with Text
An Example of JSON Schema Text
Build a New Flow File from The Schema Text JSON
Get the Latest Version of the Schema Text For this Schema By Name
If you wish, you can use the Confluent-style API against SR and against the Confluent Schema Registry. It is slightly different, but it is easy to change our REST calls to process this.
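To illustrate the difference, here is a sketch of the same calls against the Confluent Schema Registry API, where `GET /subjects` returns a plain JSON array of subject names and `GET /subjects/{subject}/versions/latest` returns a record with the schema under a `schema` key instead of `schemaText`. The host and port are hypothetical.

```python
import json
import urllib.request

# Hypothetical Confluent Schema Registry endpoint.
BASE = "http://registry-host:8081"


def latest_schema_text(record: dict) -> str:
    """Confluent keeps the schema under 'schema' rather than 'schemaText'."""
    return record["schema"]


def list_subjects() -> list:
    # GET /subjects returns a plain JSON array of subject names.
    with urllib.request.urlopen(f"{BASE}/subjects") as resp:
        return json.load(resp)


def latest_version(subject: str) -> dict:
    # GET /subjects/{subject}/versions/latest returns
    # {'subject': ..., 'version': ..., 'id': ..., 'schema': ...}
    with urllib.request.urlopen(f"{BASE}/subjects/{subject}/versions/latest") as resp:
        return json.load(resp)
```

Only the URL paths and the field names change; the overall list-then-fetch shape of the flow stays the same.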