Created on 06-06-2019 01:57 PM - edited 08-17-2019 02:18 PM
Time for tutorial 1 of a series detailing how to go from AI to Edge!
Note: all code/files referenced in this tutorial can be found on my GitHub, here.
This tutorial is divided into the following sections:
This is fairly straightforward to implement, as it is detailed in the official documentation.
Note: make sure that docker is signed in with your Docker Hub username/password (not email); otherwise the docker push will not work.
Go to Docker Hub and sign in with your account. Create a new repository as follows:
You should see something like this:
Go to a folder on your computer and create this Dockerfile (saving it as
FROM docker.repository.cloudera.com/cdsw/engine:7
RUN pip3 install --upgrade pip
RUN pip3 install keras
RUN pip3 install tensorflow
# scikit-learn is the PyPI name of the package imported as sklearn
RUN pip3 install scikit-learn
RUN pip3 install jupyter
RUN pip3 install 'prompt-toolkit==1.0.15'
RUN pip3 install onnxruntime
RUN pip3 install keras2onnx
Run the following command in the folder where the file has been saved:
docker build -t YOUR_USER/YOUR_REPO:YOUR_TAG . -f Dockerfile
Run the following command on your computer:
docker push YOUR_USER/YOUR_REPO:YOUR_TAG
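If you rebuild the image often, the login, build and push steps above can be scripted. The following is only a sketch: the helper names docker_commands and run_all are hypothetical, and the user, repository and tag values are placeholders you would replace with your own.

```python
import subprocess
from typing import List


def docker_commands(user: str, repo: str, tag: str,
                    dockerfile: str = "Dockerfile") -> List[List[str]]:
    """Return the login, build and push commands for one image reference."""
    image = f"{user}/{repo}:{tag}"
    return [
        # Log in with the Docker Hub username (not the email address).
        ["docker", "login", "--username", user],
        # Same build command as in the tutorial, run from the Dockerfile's folder.
        ["docker", "build", "-t", image, ".", "-f", dockerfile],
        ["docker", "push", image],
    ]


def run_all(commands: List[List[str]]) -> None:
    """Run each command in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)
```

Calling run_all(docker_commands("YOUR_USER", "YOUR_REPO", "YOUR_TAG")) from the folder containing the Dockerfile would run the three steps in sequence; docker login will prompt for the password.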
In CDSW 1.5, you can't add a CMD or an ENTRYPOINT to your Dockerfile. Therefore, you will need to add a .bashrc file to your CDSW project, with the following code:
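# The grep process itself usually matches "jupyter", so a count of 1 means
# Jupyter is not running yet and a count of 2 means it already is.
processes=$(ps -ef | grep jupyter | wc -l)
if (( $processes == 2 )) ; then
  echo "Jupyter is already running!"
elif (( $processes == 1 )) ; then
  jupyter notebook --no-browser --ip=0.0.0.0 --port=8080 --NotebookApp.token=
else
  echo "Invalid number of processes, relaunch your session!"
fi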
Save this file to a GitHub repository.
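The branching in the .bashrc script can look backwards at first, because a count of 1 triggers the launch. A minimal Python sketch of the same decision may help; the function name jupyter_action and its return labels are hypothetical and only mirror the three branches of the script.

```python
def jupyter_action(process_count: int) -> str:
    """Mirror the .bashrc branches for the `ps -ef | grep jupyter | wc -l` count."""
    if process_count == 2:
        # One Jupyter process plus the grep process that matched "jupyter".
        return "already-running"
    if process_count == 1:
        # Only the grep process matched, so Jupyter still has to be started.
        return "launch"
    # Anything else (0, or several Jupyter processes) is treated as an error.
    return "invalid"
```

The design relies on the grep process usually appearing in its own results, which is why the "nothing running" case is 1 rather than 0.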
In CDSW config, use the docker hub image you created as your default engine:
In CDSW, create a new project using the github repository you just created:
Note: You can create a blank project and add the .bashrc file to it, but this automates it.
In your project, open the workbench and launch a session with your custom engine. Open Terminal access: the new shell reads the .bashrc file, which launches Jupyter. You will then see the following under the 9 dots menu, allowing you to open Jupyter:
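Once the session is up, it can be worth checking that the packages baked into the custom engine are actually importable. The sketch below is one way to do that from a notebook cell; the helper name missing_packages is hypothetical, and the module list is an assumption based on the Dockerfile above (notebook stands in for the jupyter install, sklearn is the import name of scikit-learn).

```python
import importlib.util
from typing import Iterable, List


def missing_packages(module_names: Iterable[str]) -> List[str]:
    """Return the top-level modules that cannot be found in this environment."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Assumed import names for the packages installed in the custom engine.
ENGINE_MODULES = ["keras", "tensorflow", "sklearn", "notebook",
                  "onnxruntime", "keras2onnx"]
```

Running missing_packages(ENGINE_MODULES) in the session should return an empty list if the image was built and selected correctly; any name it returns points at an install step that did not make it into the engine.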
The model training is very well explained in the original Kaggle article that can be found here.
A revised version of this notebook can be found on my GitHub. The main addition to the notebook is the publishing of the model:
# Convert the trained Keras model into ONNX format with keras2onnx
import keras2onnx
import onnx

onnx_model = keras2onnx.convert_keras(model, model.name)

# Save the converted model to disk
temp_model_file = 'model.onnx'
onnx.save_model(onnx_model, temp_model_file)
After the notebook runs, you should see the model.onnx file created.
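To confirm that the exported file is usable, you can load it with onnxruntime (installed in the engine) and push one dummy input through it. This is a rough sketch, not part of the original notebook: the helper names concrete_shape and run_once are hypothetical, and it assumes the model has a single float32 input and that the default execution provider is acceptable.

```python
from typing import List, Sequence


def concrete_shape(shape: Sequence) -> List[int]:
    """Replace dynamic dimensions (None or symbolic names) with 1."""
    return [dim if isinstance(dim, int) else 1 for dim in shape]


def run_once(model_path: str = "model.onnx"):
    """Load the exported model and run it on one all-zero input."""
    # Imported here so the helper above stays usable without these packages.
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession(model_path)
    model_input = session.get_inputs()[0]
    # Assumption: the model takes a single float32 tensor as input.
    dummy = np.zeros(concrete_shape(model_input.shape), dtype=np.float32)
    return session.run(None, {model_input.name: dummy})
```

Calling run_once() from the folder containing model.onnx returns the list of output arrays; an exception at this stage usually means the conversion or the save step did not complete as expected.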