Created on 06-27-2019 02:17 PM - edited 09-16-2022 01:45 AM
Let's jump into tutorial 2 from my AI to Edge series!
This tutorial details the creation of a NiFi flow that executes the ONNX model we trained in my last article.
More precisely, we will feed in these 3 handwritten digits and predict their values:
Note: as always, all code/files referenced in this tutorial can be found on my github, here.
Below is an overview of the flow:
As you can see, the flow is divided into the following sections:
This will be useful when we deploy the flow to MiNiFi. Go to your variables and create the following:
root_folder: location of your download of my github
Create a ListFiles processor and modify the following properties:
${root_folder}NIFI/png/original/
[^\.].*.png
Create a FetchFiles processor with default parameters.
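The file filter above is a regular expression applied to each file name. The sketch below uses Python's re module as a stand-in (NiFi itself uses Java regular expressions, and this assumes the filter has to match the whole name) to show which names would pass. The sample file names are made up for illustration.

```python
import re

# File filter from the ListFiles step: the first character must not be a
# dot (this skips hidden files), then anything, then one arbitrary
# character followed by "png". The last dot is unescaped, so it matches
# any character, not only a literal ".".
FILE_FILTER = re.compile(r"[^\.].*.png")


def passes_filter(name: str) -> bool:
    """Return True when the whole file name matches the filter."""
    return FILE_FILTER.fullmatch(name) is not None


# Hypothetical file names, for illustration only.
for name in ["digit_1.png", ".hidden.png", "notes.txt", "scan_png"]:
    print(name, passes_filter(name))
```

If only real .png files should pass, escaping the last dot (\.png) would be the stricter choice.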
Note: The List/Fetch paradigm is very powerful because it will allow us to continuously look for new images without reprocessing all of them. ListFiles is a stateful processor. If you're unfamiliar with the concept I encourage you to read about it on this community.
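To make the idea of a stateful listing concrete, here is a minimal sketch in plain Python. It is not how NiFi implements ListFiles; it only assumes that the processor remembers the newest modification time it has seen and reports only files newer than that. The names StatefulLister, list_new, touch and demo are hypothetical.

```python
import os
import tempfile


class StatefulLister:
    """Toy stand-in for a stateful listing processor (hypothetical)."""

    def __init__(self, directory: str) -> None:
        self.directory = directory
        self.last_seen = -1  # newest modification time (ns) reported so far

    def list_new(self) -> list:
        """Return the names of files modified since the previous listing."""
        fresh = []
        newest = self.last_seen
        for entry in os.scandir(self.directory):
            mtime = entry.stat().st_mtime_ns
            if entry.is_file() and mtime > self.last_seen:
                fresh.append(entry.name)
                newest = max(newest, mtime)
        self.last_seen = newest  # keep the state for the next call
        return sorted(fresh)


def touch(path: str, seconds: int) -> None:
    """Create an empty file and pin its modification time."""
    with open(path, "w"):
        pass
    os.utime(path, ns=(seconds * 10**9, seconds * 10**9))


def demo() -> list:
    """List a folder three times while files arrive in between."""
    results = []
    with tempfile.TemporaryDirectory() as folder:
        lister = StatefulLister(folder)
        touch(os.path.join(folder, "a.png"), 1)
        results.append(lister.list_new())  # a.png is new
        results.append(lister.list_new())  # nothing changed
        touch(os.path.join(folder, "b.png"), 2)
        results.append(lister.list_new())  # only b.png is new
    return results


print(demo())
```

The second listing comes back empty because the state already covers a.png, which is why the flow can keep polling the folder without reprocessing old images.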
Create a ResizeImage processor and modify the following properties:
28
28
Create an UpdateAttribute processor, aimed at defining the folder and filename of the resized images, by adding the following properties to the processor:
${root_folder}NIFI/png/resized/
resized_${filename}
Create a PutFile processor and modify the following properties to store the converted image in the resized folder:
${filedirectory}
In this step we will create an ExecuteStreamCommand processor that runs the convertImg.sh Python script. The script takes the resized image file, converts it to grayscale, and writes it out as an inverted CSV to match the input of our model. Below is the script itself:
#!/usr/bin/env python3
import os,png,array
import pandas as pd
import time
import sys
from PIL import Image
from PIL import ImageOps

# Build the 784 column names: pixel0 ... pixel783
columnNames = list()
for i in range(784):
    pixel = 'pixel'
    pixel += str(i)
    columnNames.append(pixel)

train_data = pd.DataFrame(columns = columnNames)
start_time = time.time()

# First argument: path of the resized (28x28) image
img_name = sys.argv[1]
img = Image.open(img_name)
# Convert to grayscale with alpha; each pixel becomes (luminance, alpha)
img = img.convert('LA')
rawData = img.load()

# Read the luminance values row by row
data = []
for y in range(28):
    for x in range(28):
        data.append(rawData[x,y][0])
print(i)
k = 0
#print data
# Invert the pixel values and store them as a single row
train_data.loc[i] = [255-data[k] for k in range(784)]

# Second argument: path of the CSV file to write
csvFile = sys.argv[2]
print(csvFile)
train_data.to_csv(csvFile,index = False)
As you can see it expects two arguments:
img_name = sys.argv[1]
csvFile = sys.argv[2]
Thus, you will modify the following properties in the ExecuteStreamCommand processor:
${root_folder}NIFI/png/resized/${filename};${root_folder}NIFI/csv/${filename}.csv
${root_folder}NIFI/convertImg.sh
Create an UpdateAttribute processor, aimed at defining the locations of the CSV file and the ONNX model, by adding the following properties to the processor:
${root_folder}NIFI/csv/${filename}.csv
${root_folder}NOTEBOOKS/model.onnx
In this step we will create an ExecuteStreamCommand processor that runs the runModel.sh Python script. The script takes the CSV version of the image and runs the ONNX model created in the last tutorial with this CSV as input. Below is the script itself:
#!/usr/bin/env python3
import onnxruntime as rt
import onnx as ox
import numpy
import pandas as pd
import shutil
import sys

# First argument: CSV file holding the 784 pixel values
test=pd.read_csv(sys.argv[1])
X_test = test.values.astype('float32')
# Reshape each row into a 28x28 single-channel image
X_test = X_test.reshape(X_test.shape[0], 28, 28,1)

# Second argument: path of the ONNX model
session = rt.InferenceSession(sys.argv[2])
input_name = session.get_inputs()[0].name
label_name = session.get_outputs()[0].name
prediction = session.run([label_name], {input_name: X_test.astype(numpy.float32)})[0]

# Pick the digit with the highest score. The earlier loop over range(0, 9)
# never looked at digit 9 and required a score of exactly 1.0.
number = int(numpy.argmax(prediction[0]))
print(number)
As you can see it expects two arguments:
test=pd.read_csv(sys.argv[1])
session = rt.InferenceSession(sys.argv[2])
Thus, you will modify the following properties in the ExecuteStreamCommand processor:
${filename};${onnxModel}
${root_folder}NIFI/runModel.sh
If you run the flow against the images in my github, you will see 3 output flowfiles, each predicting the value of a handwritten digit, as shown below:
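Separately from the NiFi output, the last step of runModel.sh, turning the model output into a single digit, can be sketched in plain Python. This assumes the model emits one score per digit (ten in total); the function name predicted_digit and the scores below are made up for illustration and are not taken from a real run.

```python
def predicted_digit(scores):
    """Return the index (0-9) of the largest score."""
    if len(scores) != 10:
        raise ValueError("expected one score per digit, 0 through 9")
    return max(range(10), key=lambda digit: scores[digit])


# Made-up scores for illustration; a real run would use the model output.
example_scores = [0.01, 0.02, 0.03, 0.04, 0.05, 0.06, 0.07, 0.08, 0.09, 0.55]
print(predicted_digit(example_scores))
```

Taking the largest score rather than looking for a value equal to 1.0 still gives an answer when the model is less than fully confident.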