Created on 06-27-2019 02:17 PM - edited 09-16-2022 01:45 AM
Introduction
Let's jump into tutorial 2 from my AI to Edge series!
This tutorial details the creation of a NiFi flow that executes the ONNX model we trained in my last article.
More precisely, we will feed it these 3 handwritten digits and predict their value:
Note: as always, all code/files referenced in this tutorial can be found on my GitHub, here.
Agenda
Below is an overview of the flow:
As you can see, the flow is divided into the following sections:
- Section 1: Listening to a folder for new png files
- Section 2: Resizing these images to 28x28 (size used to train our model)
- Section 3: Converting these images to CSV (format used to train our model)
- Section 4: Running our predictive model
Section 1: Listening to a folder for new png files
Step 1: Set up a variable for the root folder
This will be useful when we deploy the flow to MiNiFi. Go to your variables and create the following:
- Name:
root_folder
- Value:
location of your download of my GitHub repository (with a trailing slash, since the paths used below are appended directly to it)
Step 2: List files in folder
Create a ListFile processor and modify the following properties:
- Input Directory:
${root_folder}NIFI/png/original/
- File Filter:
[^\.].*.png
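To see which file names the File Filter above lets through, here is a small Python sketch. It assumes the filter has to match the whole file name (hence `re.fullmatch`), and the file names are made up for illustration. NiFi evaluates the pattern as a Java regular expression, but a pattern this simple behaves the same way in Python.

```python
import re

# File Filter value copied as-is from the property above.
FILE_FILTER = re.compile(r"[^\.].*.png")

def is_listed(file_name: str) -> bool:
    """Return True when the whole file name matches the filter."""
    return FILE_FILTER.fullmatch(file_name) is not None

# Hypothetical file names, for illustration only.
for name in ["digit1.png", ".hidden.png", "notes.txt"]:
    print(name, is_listed(name))
```

The leading `[^\.]` is what skips hidden files. Note that the last `.` is unescaped and therefore matches any character; `[^\.].*\.png` would be a slightly stricter variant.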
Step 3: Fetch files in folder
Create a FetchFile processor with default parameters.
Note: The List/Fetch paradigm is very powerful because it allows us to continuously look for new images without reprocessing the ones we have already seen. ListFile is a stateful processor: it remembers what it has already listed. If you're unfamiliar with the concept, I encourage you to read about it on this community.
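If the idea of a stateful listing is new to you, the following Python sketch may help. It is only a conceptual illustration, not how NiFi implements it: the function name `list_new_files` and the `state` dictionary are hypothetical, and the sketch simply remembers the newest modification time it has seen so that a second pass returns only newer files.

```python
import os
import tempfile

def list_new_files(directory: str, state: dict) -> list:
    """Return files modified after the timestamp kept in `state`, then advance it."""
    last_seen = state.get("last_modified", 0.0)
    newest = last_seen
    new_files = []
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if not os.path.isfile(path):
            continue
        modified = os.path.getmtime(path)
        if modified > last_seen:
            new_files.append(entry)
            newest = max(newest, modified)
    state["last_modified"] = newest
    return new_files

# Demo in a throwaway directory, with explicit modification times.
with tempfile.TemporaryDirectory() as folder:
    state = {}
    first = os.path.join(folder, "digit1.png")
    open(first, "wb").close()
    os.utime(first, (1000, 1000))
    first_pass = list_new_files(folder, state)

    second = os.path.join(folder, "digit2.png")
    open(second, "wb").close()
    os.utime(second, (2000, 2000))
    second_pass = list_new_files(folder, state)

print(first_pass)
print(second_pass)
```

The second pass reports only the file that arrived after the first one, which is the behaviour that spares the flow from reprocessing old images.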
Section 2: Resizing these images to 28x28
Step 1: Resize Image
Create a ResizeImage processor and modify the following properties:
- Image Width (in pixels):
28
- Image Height (in pixels):
28
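If you want to check an image outside NiFi first, a rough local equivalent of this step can be sketched with Pillow (the same library used by the conversion script later in this tutorial). This is an assumption-laden stand-in: the helper name is hypothetical, and the sketch does not claim to reproduce the exact scaling algorithm used by the ResizeImage processor.

```python
from PIL import Image

def resize_to_model_input(img: Image.Image, size=(28, 28)) -> Image.Image:
    """Return a copy of the image scaled to the size the model was trained on."""
    return img.resize(size)

# Synthetic 100x80 grayscale image standing in for a real digit.
original = Image.new("L", (100, 80), color=255)
resized = resize_to_model_input(original)
print(resized.size)
```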
Step 2: Enter output attributes for resized images
Create an UpdateAttribute processor, aimed at defining the folder and filename of the resized images, by adding the following properties to the processor:
- filedirectory:
${root_folder}NIFI/png/resized/
- filename:
resized_${filename}
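To make the two expressions above concrete, here is a small Python sketch of how they resolve. It uses `string.Template`, which happens to share the `${name}` syntax but only mimics simple references (the real NiFi Expression Language is much richer). The `root_folder` value and the file name are hypothetical.

```python
from string import Template

def resolve(expression: str, attributes: dict) -> str:
    """Substitute ${name} references, mimicking simple attribute lookups."""
    return Template(expression).substitute(attributes)

# Hypothetical values: root_folder is the variable from Section 1,
# filename is the attribute carried by the incoming flowfile.
attributes = {"root_folder": "/home/user/AI_to_edge/", "filename": "digit1.png"}

filedirectory = resolve("${root_folder}NIFI/png/resized/", attributes)
new_filename = resolve("resized_${filename}", attributes)
print(filedirectory)
print(new_filename)
```

Because the expression glues `NIFI/...` directly onto `${root_folder}`, the variable needs to end with a slash.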
Section 3: Converting these images to CSV
Step 1: Saving modified image
Create a PutFile processor and modify the following property to store the resized image in the resized folder:
- Directory:
${filedirectory}
Step 2: Execute a python script to convert images to CSV
In this step we will create an ExecuteStreamCommand processor that will run the convertImg.sh Python script. The script takes the resized image file, converts it to grayscale, inverts the pixel values, and writes them to a CSV that matches the input of our model. Below is the script itself:
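#!/usr/bin/env python3
import os,png,array
import pandas as pd
import time
import sys
from PIL import Image
from PIL import ImageOps

# Build the CSV header: pixel0 ... pixel783 (28 x 28 = 784 columns)
columnNames = list()
for i in range(784):
    pixel = 'pixel'
    pixel += str(i)
    columnNames.append(pixel)

train_data = pd.DataFrame(columns = columnNames)
start_time = time.time()

# First argument: path of the resized image
img_name = sys.argv[1]
img = Image.open(img_name)
# 'LA' = luminance (grayscale) plus alpha; each pixel is a (gray, alpha) tuple
img = img.convert('LA')
rawData = img.load()

# Read the gray value of every pixel, row by row
data = []
for y in range(28):
    for x in range(28):
        data.append(rawData[x,y][0])

# i still holds 783 from the header loop; it only serves as the row label
print(i)
k = 0
#print data
# Invert the gray values (255 - value) and store them as a single row
train_data.loc[i] = [255-data[k] for k in range(784)]

# Second argument: path of the CSV to write
csvFile = sys.argv[2]
print(csvFile)
train_data.to_csv(csvFile,index = False)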
As you can see it expects two arguments:
- Location of the resized image (img_name = sys.argv[1])
- Location of the target CSV (csvFile = sys.argv[2])
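The core of the conversion (a header of 784 pixel columns followed by one row of inverted gray values) can be illustrated without any image file. The sketch below uses only the standard library and a synthetic list of pixels; the function name `pixels_to_csv` is hypothetical, and it is a simplified stand-in for the script, not a replacement for it.

```python
import csv
import io

WIDTH = HEIGHT = 28

def pixels_to_csv(gray_pixels: list) -> str:
    """Build a one-row CSV with a pixel0..pixel783 header and inverted values."""
    if len(gray_pixels) != WIDTH * HEIGHT:
        raise ValueError("expected 784 grayscale values")
    header = [f"pixel{i}" for i in range(WIDTH * HEIGHT)]
    inverted = [255 - value for value in gray_pixels]
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerow(inverted)
    return buffer.getvalue()

# Synthetic image: white background (255) with one black pixel (0) at index 0.
pixels = [255] * (WIDTH * HEIGHT)
pixels[0] = 0
csv_text = pixels_to_csv(pixels)
header_line, value_line = csv_text.splitlines()
print(header_line.split(",")[:3])
print(value_line.split(",")[:3])
```

After inversion the dark stroke becomes the high value (255) and the white background becomes 0, which is the polarity the CSV passed to the model ends up with.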
Thus, you will modify the following properties in the ExecuteStreamCommand processor:
- Command Arguments:
${root_folder}NIFI/png/resized/${filename};${root_folder}NIFI/csv/${filename}.csv
- Command Path:
${root_folder}NIFI/convertImg.sh
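The sketch below shows how these two properties turn into the `sys.argv` values the script reads. It assumes the processor's Argument Delimiter is left at `;`, uses `string.Template` as a simplified stand-in for attribute substitution, and relies on hypothetical attribute values. The helper `build_argv` is illustrative only.

```python
from string import Template

def build_argv(command_path: str, command_arguments: str,
               attributes: dict, delimiter: str = ";") -> list:
    """Resolve ${...} references, then split the arguments on the delimiter."""
    path = Template(command_path).substitute(attributes)
    resolved = Template(command_arguments).substitute(attributes)
    return [path] + resolved.split(delimiter)

# Hypothetical attributes; filename was already prefixed in Section 2.
attributes = {"root_folder": "/home/user/AI_to_edge/",
              "filename": "resized_digit1.png"}

argv = build_argv(
    "${root_folder}NIFI/convertImg.sh",
    "${root_folder}NIFI/png/resized/${filename};${root_folder}NIFI/csv/${filename}.csv",
    attributes,
)
for index, value in enumerate(argv):
    print(index, value)
```

Index 1 is the resized image and index 2 is the target CSV. Since `.csv` is appended to the full file name, the CSV ends up with a name like `resized_digit1.png.csv`.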
Section 4: Running our predictive model
Step 1: Enter input attributes for model execution
Create an UpdateAttribute processor, aimed at defining the locations of the CSV file and the ONNX model, by adding the following properties to the processor:
- filename:
${root_folder}NIFI/csv/${filename}.csv
- onnxModel:
${root_folder}NOTEBOOKS/model.onnx
Step 2: Use python to run the model with onnxruntime
In this step we will create an ExecuteStreamCommand processor that will run the runModel.sh Python script. The script takes the CSV version of the image and runs the ONNX model created in the last tutorial with this CSV as its input. Below is the script itself:
#!/usr/bin/env python3
import onnxruntime as rt
import onnx as ox
import numpy
import pandas as pd
import shutil
import sys

# First argument: CSV holding one row of 784 pixel values per image
test=pd.read_csv(sys.argv[1])
X_test = test.values.astype('float32')
# Reshape each row to a 28x28 image with a single channel
X_test = X_test.reshape(X_test.shape[0], 28, 28,1)

# Second argument: path of the ONNX model
session = rt.InferenceSession(sys.argv[2])
input_name = session.get_inputs()[0].name
label_name = session.get_outputs()[0].name
prediction = session.run([label_name], {input_name: X_test.astype(numpy.float32)})[0]

# Find the digit whose output equals 1.0 (range(10) so that digit 9 is checked too)
number = 0
for i in range(10):
    if (prediction[0][i] == 1.0):
        number = i
print(number)
As you can see it expects two arguments:
- Location of the CSV (test=pd.read_csv(sys.argv[1]))
- Location of the ONNX model (session = rt.InferenceSession(sys.argv[2]))
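The script reports the digit whose output score is exactly 1.0. If your model returns scores that are close to, but not exactly, 1.0, a more tolerant option is to take the index of the largest score. The sketch below illustrates that alternative on made-up numbers; it assumes the model returns ten scores per image (one per digit), and it is not the approach used in the script above.

```python
def predicted_digit(scores: list) -> int:
    """Return the index of the highest score, i.e. the most likely digit."""
    return max(range(len(scores)), key=lambda i: scores[i])

# Made-up model output for one image: ten scores, one per digit 0-9.
scores = [0.01, 0.02, 0.01, 0.05, 0.01, 0.02, 0.01, 0.03, 0.04, 0.80]
print(predicted_digit(scores))
```

With `numpy` already imported in the script, `numpy.argmax(prediction[0])` would express the same idea.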
Thus, you will modify the following properties in the ExecuteStreamCommand processor:
- Command Arguments:
${filename};${onnxModel}
- Command Path:
${root_folder}NIFI/runModel.sh
Results
If you run the flow against the images in my GitHub repository, you will see 3 output flowfiles, each predicting the value of a handwritten digit, as shown below: