Let's jump into tutorial 2 from my AI to Edge series!
This tutorial details the creation of a NiFi flow executing the ONNX model we trained in my last article.
More precisely, we will feed the flow the 3 handwritten digit images from my github and predict their values.
Note: as always, all code/files referenced in this tutorial can be found on my github, here.
Below is an overview of the flow:
As you can see, the flow is divided into several sections, which we will build step by step below.
First, set up a flow variable; this will be useful when we later deploy the flow to MiNiFi. Go to your flow variables and create the following:
root_folder: the location of your download of my github repository
Create a ListFile processor and modify the following properties:
Input Directory: ${root_folder}NIFI/png/original/
File Filter: [^\.].*.png
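If you want to double-check which files the File Filter will pick up, you can test the same regular expression outside NiFi. Here is a minimal sketch in Python; the sample file names are made up for illustration:

import re

# File Filter used by the ListFile processor
file_filter = re.compile(r'[^\.].*.png')

# Hypothetical file names, just to illustrate what the filter keeps
for name in ['7.png', 'digit_3.png', '.hidden.png', 'notes.txt']:
    print(name, '->', bool(file_filter.fullmatch(name)))
# '7.png' and 'digit_3.png' match; the dotfile and the .txt file do not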
Create a FetchFile processor with default parameters.
Note: The List/Fetch paradigm is very powerful because it allows us to continuously look for new images without reprocessing all of them. ListFile is a stateful processor; if you're unfamiliar with the concept, I encourage you to read about it on this community.
Create a ResizeImage processor and modify the following properties:
Image Width (in pixels): 28
Image Height (in pixels): 28
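For reference, this step is roughly equivalent to the following Pillow snippet (the file name 7.png is only an example); the actual work is done by the ResizeImage processor inside NiFi:

from PIL import Image

# Roughly what ResizeImage does: scale the original digit down to 28x28,
# the input size expected by the MNIST-style model from the previous tutorial
img = Image.open('7.png')          # example file name
resized = img.resize((28, 28))
resized.save('resized_7.png')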
Create an UpdateAttribute processor, aimed at defining the folder and filename of the resized images, by adding the following properties to the processor:
filedirectory: ${root_folder}NIFI/png/resized/
filename: resized_${filename}
Create a PutFile processor and modify the following property to store the resized image in the resized folder:
Directory: ${filedirectory}
In this step, we will create an ExecuteStreamCommand processor that runs the convertImg.sh Python script. The script takes the resized image file, converts it to grayscale, and turns it into an inverted-pixel CSV that matches the input format of our model. Below is the script itself:
#!/usr/bin/env python3
import sys
import pandas as pd
from PIL import Image

# Build the 784 column names expected by the model (pixel0 ... pixel783)
columnNames = list()
for i in range(784):
    columnNames.append('pixel' + str(i))

train_data = pd.DataFrame(columns=columnNames)

# First argument: path to the resized 28x28 image
img_name = sys.argv[1]
img = Image.open(img_name)
img = img.convert('LA')  # convert to grayscale (luminance + alpha)
rawData = img.load()

# Flatten the 28x28 image into a list of 784 pixel values
data = []
for y in range(28):
    for x in range(28):
        data.append(rawData[x, y][0])

# Invert the pixels (the model was trained on white-on-black MNIST-style digits)
# and store them as a single row
train_data.loc[0] = [255 - data[k] for k in range(784)]

# Second argument: path of the CSV file to write
csvFile = sys.argv[2]
print(csvFile)
train_data.to_csv(csvFile, index=False)
As you can see, it expects two arguments:
1. The resized image to read: img_name = sys.argv[1]
2. The CSV file to write: csvFile = sys.argv[2]
Thus, you will modify the following properties in the ExecuteStreamCommand processor:
Command Arguments: ${root_folder}NIFI/png/resized/${filename};${root_folder}NIFI/csv/${filename}.csv
Command Path: ${root_folder}NIFI/convertImg.sh
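If you want to validate the script before wiring it into NiFi, you can invoke it the same way ExecuteStreamCommand will. Here is a minimal sketch; the repository path and the image name resized_7.png are only examples, and it assumes convertImg.sh is executable:

import subprocess

# Simulate what ExecuteStreamCommand does: run convertImg.sh with the two
# semicolon-separated arguments passed as a regular argument list
root = '/path/to/repo/'    # value of the root_folder variable (example)
result = subprocess.run(
    [root + 'NIFI/convertImg.sh',
     root + 'NIFI/png/resized/resized_7.png',   # argument 1: resized image
     root + 'NIFI/csv/resized_7.png.csv'],      # argument 2: CSV to produce
    capture_output=True, text=True)
print(result.stdout)   # the script prints the path of the CSV it wrote

The same approach works for runModel.sh later in the flow.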
Create an UpdateAttribute processor, aimed at defining the locations of the CSV file and the ONNX model, by adding the following properties to the processor:
filename: ${root_folder}NIFI/csv/${filename}.csv
onnxModel: ${root_folder}NOTEBOOKS/model.onnx
In this step, we will create an ExecuteStreamCommand processor that runs the runModel.sh Python script. The script takes the CSV version of the image and runs the ONNX model created in the last tutorial with this CSV as input. Below is the script itself:
#!/usr/bin/env python3
import sys
import numpy
import pandas as pd
import onnxruntime as rt

# First argument: CSV version of the image
test = pd.read_csv(sys.argv[1])
X_test = test.values.astype('float32')
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1)

# Second argument: the ONNX model trained in the previous tutorial
session = rt.InferenceSession(sys.argv[2])
input_name = session.get_inputs()[0].name
label_name = session.get_outputs()[0].name

# Run inference; the model outputs one score per digit class (0 to 9)
prediction = session.run([label_name], {input_name: X_test.astype(numpy.float32)})[0]

# Find the class whose score is 1.0 and print it to the output stream
number = 0
for i in range(0, 10):
    if prediction[0][i] == 1.0:
        number = i
print(number)
As you can see, it expects two arguments:
1. The CSV version of the image: test = pd.read_csv(sys.argv[1])
2. The ONNX model to run: session = rt.InferenceSession(sys.argv[2])
Thus, you will modify the following properties in the ExecuteStreamCommand processor:
Command Arguments: ${filename};${onnxModel}
Command Path: ${root_folder}NIFI/runModel.sh
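One remark on the prediction logic: runModel.sh assumes the model outputs a vector where the winning class is exactly 1.0. If your exported model returns softmax probabilities instead, picking the highest score is more robust. Here is a small sketch of that variant, taking the same two arguments as runModel.sh; it is an alternative, not what the tutorial flow uses:

#!/usr/bin/env python3
import sys
import numpy
import pandas as pd
import onnxruntime as rt

# Same inputs as runModel.sh: argv[1] is the CSV image, argv[2] is the ONNX model
test = pd.read_csv(sys.argv[1])
X_test = test.values.astype('float32').reshape(-1, 28, 28, 1)

session = rt.InferenceSession(sys.argv[2])
input_name = session.get_inputs()[0].name
prediction = session.run(None, {input_name: X_test})[0]

# argmax picks the most likely digit even when the scores are probabilities
print(numpy.argmax(prediction[0]))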
If you run the flow against the images in my github, you will see 3 output flowfiles, each predicting the value of a handwritten digit, as shown below: