Created on 08-03-2017 06:16 PM - edited 08-17-2019 11:40 AM
Objective
This tutorial walks you through how to install and set up a local Hortonworks Registry to interact with Apache NiFi.
Environment
This tutorial was tested using the following environment and components:
- Mac OS X 10.11.6
- MySQL 5.7.13
- Apache NiFi 1.3.0
- Hortonworks Registry 0.2.1
Note: The record-oriented processors and controller services used in the demo flow of this tutorial were introduced in NiFi 1.2.0, so the tutorial requires NiFi 1.2.0 or later. Currently, Hortonworks Registry 0.2.1 is the version compatible with NiFi 1.3.0.
Environment Configuration
Hortonworks Registry Installation
Download the 0.2.1 Registry release:
hortonworks-registry-0.2.1.tar.gz
Extract the tar:
tar xzvf hortonworks-registry-0.2.1.tar.gz
MySQL Database Setup
Log in to your MySQL instance and create the schema registry database along with the necessary user and privileges:
unix> mysql -u root -p
unix> Enter password:<enter>
mysql> create database schema_registry;
mysql> CREATE USER 'registry_user'@'localhost' IDENTIFIED BY 'registry_password';
mysql> GRANT ALL PRIVILEGES ON schema_registry.* TO 'registry_user'@'localhost' WITH GRANT OPTION;
mysql> commit;
Configure registry.yaml
In the conf directory of the Registry, there is an example MySQL yaml file that we can repurpose:
cd hortonworks-registry-0.2.1
cp conf/registry.yaml.mysql.example conf/registry.yaml
Edit the following section in the yaml file to add appropriate database and user settings:
storageProviderConfiguration:
  providerClass: "com.hortonworks.registries.storage.impl.jdbc.JdbcStorageManager"
  properties:
    db.type: "mysql"
    queryTimeoutInSecs: 30
    db.properties:
      dataSourceClassName: "org.mariadb.jdbc.MariaDbDataSource"
      dataSource.url: "jdbc:mysql://localhost/schema_registry"
      dataSource.user: "registry_user"
      dataSource.password: "registry_password"
Note: For my environment (with MySQL installed via Homebrew), I did not need to change these default values.
Run Bootstrap Scripts
./bootstrap/bootstrap-storage.sh
Start the Registry Server
./bin/registry-server-start.sh ./conf/registry.yaml
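The server can take a moment to come up after the start script is launched. As a rough sketch, the Python helper below polls a TCP port until it accepts connections. The function name wait_for_port is hypothetical, and port 9090 is an assumption taken from the Schema Registry URL (http://localhost:9090/api/v1) used later in this tutorial; adjust it if your registry.yaml configures a different port.

```python
import socket
import time


def wait_for_port(host, port, timeout_s=60.0, interval_s=1.0):
    """Return True once host:port accepts a TCP connection, False on timeout."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            # A successful connect only shows that something is listening,
            # not that the Registry has finished initializing.
            with socket.create_connection((host, port), timeout=interval_s):
                return True
        except OSError:
            time.sleep(interval_s)
    return False


# Example (assumed port): wait_for_port("localhost", 9090)
```

A True result only means the port is open; opening the Registry UI in the browser remains the real check.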
Open Registry UI
Navigate to the registry UI in your browser:
Schema Creation
Select the "+" button to add a schema to the registry:
Configure the schema as follows:
The schema text is:
{
  "type": "record",
  "name": "UserRecord",
  "fields" : [
    {"name": "id", "type": "long"},
    {"name": "title", "type": ["null", "string"]},
    {"name": "first", "type": ["null", "string"]},
    {"name": "last", "type": ["null", "string"]},
    {"name": "street", "type": ["null", "string"]},
    {"name": "city", "type": ["null", "string"]},
    {"name": "state", "type": ["null", "string"]},
    {"name": "zip", "type": ["null", "string"]},
    {"name": "gender", "type": ["null", "string"]},
    {"name": "email", "type": ["null", "string"]},
    {"name": "username", "type": ["null", "string"]},
    {"name": "password", "type": ["null", "string"]},
    {"name": "phone", "type": ["null", "string"]},
    {"name": "cell", "type": ["null", "string"]},
    {"name": "ssn", "type": ["null", "string"]},
    {"name": "date_of_birth", "type": ["null", "string"]},
    {"name": "reg_date", "type": ["null", "string"]},
    {"name": "large", "type": ["null", "string"]},
    {"name": "medium", "type": ["null", "string"]},
    {"name": "thumbnail", "type": ["null", "string"]},
    {"name": "version", "type": ["null", "string"]},
    {"name": "nationality", "type": ["null", "string"]}
  ]
}
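Before pasting the schema text into the UI, it can help to confirm that it is well-formed JSON with unique field names. The following is a minimal sketch using only the Python standard library; the helper names build_user_schema and check_schema_text are hypothetical, and the check covers JSON structure only, not full Avro validation.

```python
import json

# Field names copied from the UserRecord schema above.
# "id" is a long; every other field is a nullable string.
NULLABLE_STRING_FIELDS = [
    "title", "first", "last", "street", "city", "state", "zip", "gender",
    "email", "username", "password", "phone", "cell", "ssn", "date_of_birth",
    "reg_date", "large", "medium", "thumbnail", "version", "nationality",
]


def build_user_schema():
    """Rebuild the UserRecord schema as a Python dict."""
    fields = [{"name": "id", "type": "long"}]
    fields += [{"name": name, "type": ["null", "string"]}
               for name in NULLABLE_STRING_FIELDS]
    return {"type": "record", "name": "UserRecord", "fields": fields}


def check_schema_text(schema_text):
    """Parse schema text and return its field names.

    Raises ValueError if the text is not valid JSON (json.JSONDecodeError is
    a ValueError), if the top-level type is not "record", or if two fields
    share a name.
    """
    schema = json.loads(schema_text)
    if schema.get("type") != "record":
        raise ValueError("top-level type must be 'record'")
    names = [field["name"] for field in schema["fields"]]
    if len(names) != len(set(names)):
        raise ValueError("duplicate field names in schema")
    return names
```

Calling check_schema_text(json.dumps(build_user_schema())) returns the 22 field names in schema order; the same function can be run on text copied from a file before it is pasted into the registry.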
Save the schema:
NiFi Configuration
NiFi Template & CSV File
The flow we are going to use for this tutorial is the same one used in the article Convert CSV to JSON, Avro, XML using ConvertRecord. However, we are going to modify the flow to use a schema in our local Hortonworks Registry instead of a local Avro Schema Registry.
The template can be downloaded here: convert-csv-to-json.xml
The CSV file used by the flow can be downloaded here: users.txt
Note: Change the extension from .txt to .csv after downloading.
NiFi Flow Configuration
Input and Output
Create two local directories: one for input and one for the JSON output. Place the "users.csv" file in the input directory.
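The directory setup can also be scripted. This is a small sketch under stated assumptions: the function name prepare_flow_dirs and the directory names "input" and "output" are hypothetical, and the file is copied rather than moved because GetFile can be configured to delete the files it picks up.

```python
import shutil
from pathlib import Path


def prepare_flow_dirs(base_dir, downloaded_file):
    """Create input/output directories and stage the CSV as users.csv.

    base_dir and the "input"/"output" names are illustrative; use whatever
    paths you later enter in the GetFile and PutFile processors.
    """
    base = Path(base_dir)
    input_dir = base / "input"
    output_dir = base / "output"
    input_dir.mkdir(parents=True, exist_ok=True)
    output_dir.mkdir(parents=True, exist_ok=True)
    # Copy under the .csv name, leaving the downloaded users.txt in place.
    shutil.copy(downloaded_file, input_dir / "users.csv")
    return input_dir, output_dir
```

For example, prepare_flow_dirs("/tmp/nifi-demo", "users.txt") stages the file at /tmp/nifi-demo/input/users.csv, where /tmp/nifi-demo is only a placeholder.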
Import Template
Start NiFi. Import the provided template and add it to the canvas:
Update Directory Paths in GetFile and PutFile Processors
Change the Input Directory path in the GetFile processor to point to your local input directory:
Change the Directory path in the PutFile processor to point to your local output directory:
Edit and Enable Controller Services
Now all that remains to run the flow is to modify the schema registry that is used by the record reader and writer controller services. The template is configured to use a local AvroSchemaRegistry controller service. We will change it to use the HortonworksSchemaRegistry.
Select the root process group "NiFi Flow" by clicking an empty area of the canvas. Select the gear icon from the Operate Palette:
This opens the NiFi Flow Configuration window. Select the Controller Services tab and click the "+" button to create a new controller service.
Select HortonworksSchemaRegistry from the list and click "Add":
Select the Edit button ("pencil" icon) next to the HortonworksSchemaRegistry controller service. Configure it to point to the local Hortonworks Schema Registry instance by adding http://localhost:9090/api/v1 as the value for the "Schema Registry URL" property:
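If the controller service later fails to find the schema, it can be useful to query the registry directly at the same base URL. The sketch below uses only the Python standard library. The endpoint path /schemaregistry/schemas/{name}/versions/latest is an assumption about the Registry REST API and should be checked against the Registry documentation for your version; the function names and the example schema name "users" are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# Base URL as entered in the "Schema Registry URL" property.
REGISTRY_URL = "http://localhost:9090/api/v1"


def latest_version_url(schema_name, base_url=REGISTRY_URL):
    """Build the URL for the latest version of a schema (assumed API path)."""
    quoted = urllib.parse.quote(schema_name, safe="")
    return f"{base_url.rstrip('/')}/schemaregistry/schemas/{quoted}/versions/latest"


def fetch_latest_schema(schema_name, base_url=REGISTRY_URL, timeout=5):
    """Fetch and decode the registry's JSON response for a schema.

    Raises urllib.error.URLError if the registry is unreachable and
    urllib.error.HTTPError if the schema or path does not exist.
    """
    url = latest_version_url(schema_name, base_url)
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


# Example (hypothetical schema name): fetch_latest_schema("users")
```

Use the name you gave the schema when creating it in the Registry UI.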
Select the Edit button ("pencil" icon) next to the CSVReader controller service. Change the "Schema Registry" property value from AvroSchemaRegistry to now point to HortonworksSchemaRegistry:
Select the Edit button ("pencil" icon) next to the JsonRecordSetWriter controller service. Change the "Schema Registry" property value from AvroSchemaRegistry to now point to HortonworksSchemaRegistry:
Enable the HortonworksSchemaRegistry controller service by selecting the lightning bolt icon. Once it is enabled, you can enable the CSVReader and JsonRecordSetWriter controller services by selecting their lightning bolt icons as well. All the necessary controller services should be enabled at this point:
Note: The AvroSchemaRegistry controller service is no longer used by the flow and can remain disabled.
Run the Flow
The flow can now be started:
When run successfully, the JSON formatted file is placed in the local directory we specified earlier in the PutFile processor:
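One quick way to confirm the result is to parse whatever landed in the output directory. This is a minimal sketch that assumes each output file holds either a JSON array of records or a single JSON object; the function name summarize_output is hypothetical.

```python
import json
from pathlib import Path


def summarize_output(output_dir):
    """Map each file name in output_dir to the number of JSON records it holds.

    Raises json.JSONDecodeError if a file is not valid JSON.
    """
    summary = {}
    for path in sorted(Path(output_dir).iterdir()):
        if not path.is_file():
            continue
        with path.open(encoding="utf-8") as handle:
            data = json.load(handle)
        # Treat a single top-level object as one record.
        records = data if isinstance(data, list) else [data]
        summary[path.name] = len(records)
    return summary
```

Pass it the same directory you set in the PutFile processor; a parse error here usually points at the writer configuration rather than the schema registry.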
To learn more about the flow with more detailed explanations of the record-oriented processors and controller services in NiFi, see Convert CSV to JSON, Avro, XML using ConvertRecord.
Created on 03-30-2021 12:51 PM
Thanks for the great article. Can you update the steps needed to set up SSL authentication for the schema registry?
Created on 09-21-2021 10:46 AM - edited 09-21-2021 10:46 AM
@rajeshfss - Refer to the Registry docs -- Running registry and streamline web-services securely.