Created on 08-03-2017 06:16 PM - edited 08-17-2019 11:40 AM
This tutorial walks you through how to install and set up a local Hortonworks Registry to interact with Apache NiFi.
This tutorial was tested using the following environment and components:
Note: The record-oriented processors and controller services used in the demo flow of this tutorial were introduced in NiFi 1.2.0, so the tutorial requires NiFi 1.2.0 or later. Currently, Hortonworks Registry 0.2.1 is the version compatible with NiFi 1.3.0.
Download the 0.2.1 Registry release:
hortonworks-registry-0.2.1.tar.gz
Extract the tar:
tar xzvf hortonworks-registry-0.2.1.tar.gz
Log in to your MySQL instance and create the schema registry database and the necessary users and privileges:
unix> mysql -u root -p
unix> Enter password:<enter>
mysql> create database schema_registry;
mysql> CREATE USER 'registry_user'@'localhost' IDENTIFIED BY 'registry_password';
mysql> GRANT ALL PRIVILEGES ON schema_registry.* TO 'registry_user'@'localhost' WITH GRANT OPTION;
mysql> commit;
In the conf directory of the Registry, there is an example MySQL yaml file that we can repurpose:
cd hortonworks-registry-0.2.1
cp conf/registry.yaml.mysql.example conf/registry.yaml
Edit the following section in the yaml file to add appropriate database and user settings:
storageProviderConfiguration:
  providerClass: "com.hortonworks.registries.storage.impl.jdbc.JdbcStorageManager"
  properties:
    db.type: "mysql"
    queryTimeoutInSecs: 30
    db.properties:
      dataSourceClassName: "org.mariadb.jdbc.MariaDbDataSource"
      dataSource.url: "jdbc:mysql://localhost/schema_registry"
      dataSource.user: "registry_user"
      dataSource.password: "registry_password"
Note: For my environment (with MySQL installed via Homebrew), I did not need to change these default values.
Run the bootstrap script to create the Registry tables in the database:
./bootstrap/bootstrap-storage.sh
Start the Registry server:
./bin/registry-server-start.sh ./conf/registry.yaml
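The server can take a little while to start, so it may help to wait until its port accepts connections before opening the UI. The following is a minimal sketch, not part of the Registry distribution: the helper name `wait_for_port` is hypothetical, and port 9090 is assumed from the Schema Registry URL used later in this tutorial.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=60.0, interval_s=1.0):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            # A successful connect only shows that something is listening on the port.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval_s)


# Example call (assumes the Registry listens on localhost:9090):
# wait_for_port("localhost", 9090, timeout_s=60)
```

A successful connection does not prove the Registry finished initializing, so treat this as a coarse readiness check only.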
Navigate to the registry UI in your browser:
Select the "+" button to add a schema to the registry:
Configure the schema as follows:
The schema text is:
{
  "type": "record",
  "name": "UserRecord",
  "fields" : [
    {"name": "id", "type": "long"},
    {"name": "title", "type": ["null", "string"]},
    {"name": "first", "type": ["null", "string"]},
    {"name": "last", "type": ["null", "string"]},
    {"name": "street", "type": ["null", "string"]},
    {"name": "city", "type": ["null", "string"]},
    {"name": "state", "type": ["null", "string"]},
    {"name": "zip", "type": ["null", "string"]},
    {"name": "gender", "type": ["null", "string"]},
    {"name": "email", "type": ["null", "string"]},
    {"name": "username", "type": ["null", "string"]},
    {"name": "password", "type": ["null", "string"]},
    {"name": "phone", "type": ["null", "string"]},
    {"name": "cell", "type": ["null", "string"]},
    {"name": "ssn", "type": ["null", "string"]},
    {"name": "date_of_birth", "type": ["null", "string"]},
    {"name": "reg_date", "type": ["null", "string"]},
    {"name": "large", "type": ["null", "string"]},
    {"name": "medium", "type": ["null", "string"]},
    {"name": "thumbnail", "type": ["null", "string"]},
    {"name": "version", "type": ["null", "string"]},
    {"name": "nationality", "type": ["null", "string"]}
  ]
}
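Before pasting the schema text into the UI, it can be worth confirming that it parses as JSON and that the required and nullable fields are what you expect. The sketch below uses only the Python standard library and an abbreviated copy of the schema (four of its fields); the function name `split_fields` is hypothetical, and the check covers JSON structure only, not full Avro validation.

```python
import json

# Abbreviated copy of the tutorial's UserRecord schema (only four of the fields).
SCHEMA_TEXT = """
{
  "type": "record",
  "name": "UserRecord",
  "fields": [
    {"name": "id", "type": "long"},
    {"name": "title", "type": ["null", "string"]},
    {"name": "first", "type": ["null", "string"]},
    {"name": "last", "type": ["null", "string"]}
  ]
}
"""


def split_fields(schema_text):
    """Return (required, nullable) field-name lists for an Avro record schema."""
    schema = json.loads(schema_text)  # raises ValueError on malformed JSON
    if schema.get("type") != "record":
        raise ValueError("expected an Avro record schema")
    required, nullable = [], []
    for field in schema["fields"]:
        field_type = field["type"]
        # A union is written as a JSON list; the field is nullable if "null" is a member.
        if isinstance(field_type, list) and "null" in field_type:
            nullable.append(field["name"])
        else:
            required.append(field["name"])
    return required, nullable


required, nullable = split_fields(SCHEMA_TEXT)
print("required:", required)
print("nullable:", nullable)
```

In this schema only `id` is a plain `long`; every other field is a union with `null`, so missing values in the CSV do not violate the schema.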
Save the schema:
The flow we are going to use for this tutorial is the same one used in the article Convert CSV to JSON, Avro, XML using ConvertRecord. However, we are going to modify the flow to use a schema in our local Hortonworks Registry instead of a local Avro Schema Registry.
The template can be downloaded here: convert-csv-to-json.xml
The CSV file used by the flow can be downloaded here: users.txt
Note: Change the extension from .txt to .csv after downloading.
Create two local directories: one for the input and one for the JSON output. Place the "users.csv" file in the input directory.
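The directory setup and the rename from .txt to .csv can also be scripted. This is a minimal sketch under assumed names: the function `prepare_dirs`, the `input` and `output` directory names, and the base path in the example call are all hypothetical, so substitute whatever locations you plan to configure in GetFile and PutFile.

```python
import shutil
from pathlib import Path


def prepare_dirs(base_dir, downloaded_file):
    """Create input/ and output/ under base_dir and copy the download in as users.csv."""
    base = Path(base_dir)
    input_dir = base / "input"
    output_dir = base / "output"
    input_dir.mkdir(parents=True, exist_ok=True)
    output_dir.mkdir(parents=True, exist_ok=True)
    # Copy rather than move so the original download is left untouched.
    shutil.copyfile(downloaded_file, input_dir / "users.csv")
    return input_dir, output_dir


# Example call (hypothetical paths):
# prepare_dirs("/tmp/nifi-registry-demo", "users.txt")
```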
Start NiFi. Import the provided template and add it to the canvas:
Change the Input Directory path in the GetFile processor to point to your local input directory:
Change the Directory path in the PutFile processor to point to your local output directory:
Now all that remains to run the flow is to modify the schema registry that is used by the record reader and writer controller services. The template is configured to use a local AvroSchemaRegistry controller service. We will change it to use the HortonworksSchemaRegistry.
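To see why both the reader and the writer need access to the schema, it may help to look at a conceptual version of the conversion. The sketch below is not how NiFi implements it; it only illustrates the idea that a schema tells the reader how to type each CSV column and tells the writer which fields to emit. It assumes a CSV with a header row whose column names match the schema field names, treats empty cells as null, and uses an abbreviated field list.

```python
import csv
import io
import json

# Abbreviated field list from the tutorial's UserRecord schema.
SCHEMA = {
    "type": "record",
    "name": "UserRecord",
    "fields": [
        {"name": "id", "type": "long"},
        {"name": "first", "type": ["null", "string"]},
        {"name": "last", "type": ["null", "string"]},
    ],
}


def coerce(value, avro_type):
    """Convert one CSV cell to the Python value implied by the Avro type."""
    if isinstance(avro_type, list):  # union such as ["null", "string"]
        if value == "":
            return None  # assumption of this sketch: an empty cell becomes null
        non_null = [t for t in avro_type if t != "null"]
        return coerce(value, non_null[0])
    if avro_type == "long":
        return int(value)
    return value  # "string" and any other type stay as text in this sketch


def csv_to_json(csv_text, schema):
    """Read CSV with a header row and emit a JSON array shaped by the schema."""
    rows = csv.DictReader(io.StringIO(csv_text))
    records = [
        {f["name"]: coerce(row[f["name"]], f["type"]) for f in schema["fields"]}
        for row in rows
    ]
    return json.dumps(records)


sample = "id,first,last\n1,Ann,Lee\n2,Bob,\n"
print(csv_to_json(sample, SCHEMA))
```

In the real flow the schema is not embedded in code like this; the controller services look it up in the registry, which is what the following steps configure.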
Select the root process group "NiFi Flow" by clicking an empty area of the canvas. Select the gear icon from the Operate Palette:
This opens the NiFi Flow Configuration window. Select the Controller Services tab and click the "+" button to create a new controller service.
Select HortonworksSchemaRegistry from the list and click "Add":
Select the Edit button ("pencil" icon) next to the HortonworksSchemaRegistry controller service. Configure it to point to the local Hortonworks Schema Registry instance by adding http://localhost:9090/api/v1 as the value for the "Schema Registry URL" property:
Select the Edit button ("pencil" icon) next to the CSVReader controller service. Change the "Schema Registry" property value from AvroSchemaRegistry to now point to HortonworksSchemaRegistry:
Select the Edit button ("pencil" icon) next to the JsonRecordSetWriter controller service. Change the "Schema Registry" property value from AvroSchemaRegistry to now point to HortonworksSchemaRegistry:
Enable the HortonworksSchemaRegistry controller service by selecting the lightning bolt icon. This then allows you to enable the CSVReader and JsonRecordSetWriter controller services. Select the lightning bolt icons for both of these services. All the necessary controller services should be enabled at this point:
Note: The AvroSchemaRegistry controller service is no longer used by the flow and can remain disabled.
The flow can now be started:
When the flow runs successfully, the JSON-formatted file is placed in the local output directory we specified earlier in the PutFile processor:
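A quick way to sanity-check the result is to load the output file and confirm that each record carries the fields defined in the schema. The following is a minimal sketch: the function name `check_output` and the file path in the example call are hypothetical, and it assumes the writer produced a single JSON array of record objects.

```python
import json
from pathlib import Path


def check_output(json_path, expected_fields):
    """Return a list of problems found in a JSON array of records."""
    records = json.loads(Path(json_path).read_text(encoding="utf-8"))
    if not isinstance(records, list):
        return ["top-level JSON value is not an array"]
    problems = []
    for index, record in enumerate(records):
        missing = [name for name in expected_fields if name not in record]
        if missing:
            problems.append(f"record {index} is missing {missing}")
    return problems


# Example call (hypothetical output path, abbreviated field list):
# check_output("/tmp/nifi-registry-demo/output/users.json", ["id", "title", "first", "last"])
```

An empty list means every record contained all of the expected field names; the check does not validate the field types.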
To learn more about the flow with more detailed explanations of the record-oriented processors and controller services in NiFi, see Convert CSV to JSON, Avro, XML using ConvertRecord.
Created on 03-30-2021 12:51 PM
Thanks for the great article. Can you update the steps needed to set up SSL authentication for the schema registry?
Created on 09-21-2021 10:46 AM - edited 09-21-2021 10:46 AM
@rajeshfss - Refer to the Registry docs -- Running registry and streamline web-services securely.