Machine Learning and Artificial Intelligence are exploding in importance and prevalence in the enterprise. With this explosive growth come fundamental challenges in governing model deployments, and doing so at scale. These challenges revolve around answering the following fundamental questions:
Previous article: Customizing Atlas (Part 1): Model governance, traceability and registry
In the previous article I showed how Atlas is a powerful and natural fit for storing and searching model and deployment metadata.
The main features of Atlas model metadata developed in the referenced article are
In this article, I present an overarching deployment framework that implements this Atlas governance of models and thus allows stakeholders to answer the above questions as deployed models proliferate. Think of the prevalence of ML and AI one, two, five years from now.
The personas involved in the model deployment-governance framework are shown below with their actions.
Model owner: stages model artifacts in a defined structure and provides an overview of the model and project in a README file.
Operations: launches automation that deploys the model, copies artifacts from staging to the model registry, and creates a Model entity in Atlas for this deployment.
Multiple stakeholders (data scientist, data steward, compliance, production issue troubleshooters, etc.): use Atlas to answer fundamental questions about deployed models and to access concrete artifacts of those models.
Details of the deployment-governance framework and persona interactions with it are shown below.
Step 1: Model owner stages the model artifacts. This includes:
Step 2: Operations deploys the model via an orchestrator automation. This automation:
Step 3: Stakeholders use Atlas to understand deployed models.
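For example, a compliance reviewer or troubleshooter could list every deployed model straight from the Atlas REST API. The host and credentials below are placeholders, and the custom model type is defined later in this article:

# basic search for all entities of the custom "model" type (host/credentials are placeholders)
curl -u admin:admin \
  "http://sandbox-hdp.hortonworks.com:21000/api/atlas/v2/search/basic?typeName=model"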
I show below how to implement the deployment framework.
Important point: I have chosen the technologies shown below for a simple demonstration of the framework. Except for Atlas, technology implementations are your choice. For example, you could deploy your model to Spark on Hadoop instead of to a microservice, or you could use PMML instead of MLeap to serialize your model, etc.
Important point summarized: This framework is a template and, except for Atlas, the technologies are swappable.
MLeap: follow the instructions here to set up a dockerized MLeap Runtime http://mleap-docs.combust.ml/mleap-serving/
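As a rough sketch only (the image tag, port, and volume path below are taken from the MLeap serving docs at the time of writing and may have changed), the dockerized runtime can be started with something like:

# pull and run the MLeap serving container, mounting a local directory that the
# deploy script can drop serialized bundles into (tag and paths are assumptions)
docker pull combustml/mleap-serving:0.9.0-SNAPSHOT
docker run -d -p 65327:65327 -v /tmp/models:/models combustml/mleap-serving:0.9.0-SNAPSHOT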
HDP: Create an HDP cluster sandbox using these instructions
Atlas Model Type: When your HDP cluster is running, create your Atlas model type by running:
#!/bin/bash
ATLAS_UU_PWD=$1
ATLAS_HOST=$2

curl -u ${ATLAS_UU_PWD} -ik -H "Content-Type: application/json" -X POST http://${ATLAS_HOST}:21000/api/atlas/v2/types/typedefs -d '{
  "enumDefs": [],
  "structDefs": [],
  "classificationDefs": [],
  "entityDefs": [
    {
      "superTypes": ["Process"],
      "name": "model",
      "typeVersion": "1.0",
      "attributeDefs": [
        { "name": "qualifiedName", "typeName": "string", "cardinality": "SINGLE", "isUnique": false, "isOptional": false, "isIndexable": true },
        { "name": "name", "typeName": "string", "cardinality": "SINGLE", "isUnique": false, "isOptional": false, "isIndexable": true },
        { "name": "inputs", "typeName": "array<DataSet>", "isOptional": true, "cardinality": "SET", "valuesMinCount": 0, "valuesMaxCount": 2147483647, "isUnique": false, "isIndexable": false, "includeInNotification": false },
        { "name": "outputs", "typeName": "array<DataSet>", "isOptional": true, "cardinality": "SET", "valuesMinCount": 0, "valuesMaxCount": 2147483647, "isUnique": false, "isIndexable": false, "includeInNotification": false },
        { "name": "deploy.datetime", "typeName": "string", "cardinality": "SINGLE", "isUnique": false, "isOptional": false, "isIndexable": true },
        { "name": "deploy.host.type", "typeName": "string", "cardinality": "SINGLE", "isUnique": false, "isOptional": false, "isIndexable": true },
        { "name": "deploy.host.detail", "typeName": "string", "cardinality": "SINGLE", "isUnique": false, "isOptional": false, "isIndexable": true },
        { "name": "deploy.obj.source", "typeName": "string", "cardinality": "SINGLE", "isUnique": false, "isOptional": false, "isIndexable": true },
        { "name": "model.name", "typeName": "string", "cardinality": "SINGLE", "isUnique": false, "isOptional": false, "isIndexable": true },
        { "name": "model.version", "typeName": "string", "cardinality": "SINGLE", "isUnique": false, "isOptional": false, "isIndexable": true },
        { "name": "model.type", "typeName": "string", "cardinality": "SINGLE", "isUnique": false, "isOptional": false, "isIndexable": true },
        { "name": "model.description", "typeName": "string", "cardinality": "SINGLE", "isUnique": false, "isOptional": false, "isIndexable": true },
        { "name": "model.owner", "typeName": "string", "cardinality": "SINGLE", "isUnique": false, "isOptional": false, "isIndexable": true },
        { "name": "model.owner.lob", "typeName": "string", "cardinality": "SINGLE", "isUnique": false, "isOptional": false, "isIndexable": true },
        { "name": "model.registry.url", "typeName": "string", "cardinality": "SINGLE", "isUnique": false, "isOptional": false, "isIndexable": true }
      ]
    }
  ]
}'
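Assuming the script above is saved as createModelType.sh (the file name is arbitrary), it can be run against the sandbox like so, with placeholder credentials and host:

# usage: ./createModelType.sh <user:password> <atlas-host>
./createModelType.sh admin:admin sandbox-hdp.hortonworks.com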
See Customizing Atlas (Part1): Model governance, traceability and registry for details
See GitHub repo README.md for details on running: https://github.com/gregkeysquest/ModelDeployment-microservice
Main points are shown below.
See repo https://github.com/gregkeysquest/Staging-ModelDeploy-v1.0 for details.
Main points are:
model.owner = Greg Keys
model.owner.lob = pricing
model.name = rental pricing prediction
model.type = gradient boosting regression
model.version = 1.1
model.description = model predicts monthly price of rental if property is purchased
model.microservice.endpoint = target
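Putting this together with the orchestrator code below, the staging repo is expected to look roughly like this (everything except modelMetadata.txt and the executable/ directory is illustrative):

Staging-ModelDeploy-v1.0/
├── README.md           overview of the model and project for stakeholders
├── modelMetadata.txt   the key/value metadata shown above
└── executable/
    └── <serialized model bundle, e.g. an MLeap .zip>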
The core code for the Groovy orchestrator is shown below
//STEP 1: retrieve artifacts
println "[STEP 1: retrieve artifacts] ..... downloading repo to tmp: repo=${repo} \n"
processBuilder = new ProcessBuilder("shellScripts/fetchRepo.sh", repo, repoCreds, repoRoot).inheritIO().start().waitFor()

//metadata aggregation
println "[metadata aggregation] ..... gathering model metadata from repo \n"
ModelMetadata.loadModelMetadata(repo, localRepo)

//STEP 2: deploy serialized model
def modelExecutable = new File("tmp/${repo}/executable").listFiles()[0].getName()
println "[STEP 2: deploy serialized model] ..... deploying model to microservice: modelToDeploy=${modelExecutable} \n"
processBuilder = new ProcessBuilder("shellScripts/deployModel.sh", repo, deployHostPort, modelExecutable).inheritIO().start().waitFor()

//STEP 3: put artifacts to registry
def modelRegistryPath = "hdfs://${hdfsHostName}:8020${hdfsRegistryRoot}/${repo}"
println "[STEP 3: put artifacts to registry] ..... copying tmp to model registry: modelRegistryPath=${modelRegistryPath} \n"
processBuilder = new ProcessBuilder("shellScripts/pushToRegistry.sh", repo, modelRegistryPath, devMode.toString()).inheritIO().start().waitFor()

//metadata aggregation
println "[metadata aggregation] ..... gathering model deploy metadata \n"
ModelMetadata.loadDeployMetadata(modelRegistryPath, modelExecutable, deployHostPort, deployHostType)

//STEP 4: create Atlas model entity
println "[STEP 4: create Atlas model entity] ..... deploying Atlas entity to ${atlasHost} \n"
processBuilder = new ProcessBuilder("shellScripts/createAtlasModelEntity.sh",
        atlasCreds, atlasHost,
        ModelMetadata.deployQualifiedName, ModelMetadata.deployName, ModelMetadata.deployDateTime,
        ModelMetadata.deployEndPoint, ModelMetadata.deployHostType, ModelMetadata.modelExecutable,
        ModelMetadata.name, ModelMetadata.type, ModelMetadata.version,
        ModelMetadata.description, ModelMetadata.owner, ModelMetadata.ownerLob,
        ModelMetadata.registryURL
).inheritIO().start().waitFor()
Notice
Code for processing and aggregating metadata is shown here
class ModelMetadata {

    static metadataFileLocation = "staging/modelMetadata.txt"
    static Properties props = null

    static repo = ""
    static owner = ""
    static ownerLob = ""
    static name = ""
    static type = ""
    static version = ""
    static description = ""
    static endpoint = ""
    static registryURL = ""
    static modelExecutable = ""
    static deployEndPoint = ""
    static deployHostType = ""
    static deployDateTime = ""
    static deployName = ""
    static deployQualifiedName = ""

    static void loadModelMetadata(repo, localRepo) {
        this.repo = repo
        props = new Properties()
        def input = new FileInputStream(localRepo + "/modelMetadata.txt")
        props.load(input)

        this.owner = props.getProperty("model.owner")
        this.ownerLob = props.getProperty("model.owner.lob")
        this.name = props.getProperty("model.name")
        this.type = props.getProperty("model.type")
        this.version = props.getProperty("model.version")
        this.description = props.getProperty("model.description")
        this.endpoint = props.getProperty("model.microservice.endpoint")
    }

    static loadDeployMetadata(modelRegistryPath, modelExecutable, deployHostPort, deployHostType) {
        this.deployDateTime = new Date().format('yyyy-MM-dd_HH:mm:ss', TimeZone.getTimeZone('EST')) + "EST"
        this.deployName = "${this.name} v${this.version}"
        this.deployQualifiedName = "${this.deployName}@${deployHostPort}".replace(' ', '-')
        this.registryURL = modelRegistryPath
        this.modelExecutable = modelExecutable
        this.deployEndPoint = "http://${deployHostPort}/${this.endpoint}"
        this.deployHostType = deployHostType
    }
}
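As a concrete example, with the modelMetadata.txt shown earlier and a hypothetical deployHostPort of localhost:65327, loadDeployMetadata() produces deployName = "rental pricing prediction v1.1" and deployQualifiedName = "rental-pricing-prediction-v1.1@localhost:65327", which becomes the qualifiedName of the Atlas entity created in Step 4.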
Each shell script that is called by the orchestrator is shown in the code blocks below
Step 1: fetch staging (maps to 2a in diagram)
#!/bin/bash
# script name: fetchRepo.sh
echo "calling fetchRepo.sh"

REPO=$1
REPO_CRED=$2
REPO_ROOT=$3

# create tmp directory to store staging
mkdir -p tmp
cd tmp

# fetch staging and unzip
curl -u $REPO_CRED -L -o $REPO.zip http://github.com/$REPO_ROOT/$REPO/zipball/master/
unzip $REPO.zip

# rename to simplify downstream processing
mv ${REPO_ROOT}* $REPO

# remove zip
rm $REPO.zip

echo "finished fetchRepo.sh"
Step 2: deploy model (maps to 2b in diagram)
#!/bin/bash
# script name: deployModel.sh
echo "starting deployModel.sh"

REPO=$1
HOSTPORT=$2
EXECUTABLE=$3

# copy executable from staging to the load path the microservice serves models from
echo "copying executable to load path with command: cp tmp/${REPO}/executable/* loadModel/"
mkdir loadModel
cp tmp/$REPO/executable/* loadModel/

# simplify special string characters
Q="\""
SP="{"
EP="}"

# create json for curl string
JSON_PATH="${SP}${Q}path${Q}:${Q}/models/${EXECUTABLE}${Q}${EP}"

# create host for curl string
URL="http://$HOSTPORT/model"

# run curl string
echo "running command: curl -XPUT -H \"content-type: application/json\" -d ${JSON_PATH} ${URL}"
curl -XPUT -H "content-type: application/json" -d $JSON_PATH $URL

echo "finished deployModel.sh"
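For instance, with a hypothetical bundle named airbnb.model.lr.zip and the MLeap runtime listening on localhost:65327, the script above effectively issues:

curl -XPUT -H "content-type: application/json" \
  -d '{"path":"/models/airbnb.model.lr.zip"}' \
  http://localhost:65327/model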
Step 3: copy staging to model repository (maps to 2c in diagram)
#!/bin/bash
# script name: pushToRegistry.sh
## Note: for ease of development there is a local mode to write to the local file system instead of HDFS
echo "calling pushToRegistry.sh"

REPO_LOCAL=$1
HDFS_TARGET=$2
DEV_MODE=$3

cd tmp
echo "copying localRepository=${REPO_LOCAL} to hdfs modelRegistryPath=${HDFS_TARGET}"

if [ "$DEV_MODE" = "true" ]; then
  MOCK_REGISTRY="../mockedHDFSModelRegistry"
  echo "NOTE: in dev mode .. copying from ${REPO_LOCAL} to ${MOCK_REGISTRY}"
  mkdir -p $MOCK_REGISTRY
  cp -R $REPO_LOCAL $MOCK_REGISTRY/
else
  sudo hdfs dfs -copyFromLocal $REPO_LOCAL $HDFS_TARGET
fi

echo "finished pushToRegistry.sh"
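Once the copy completes, the registry contents can be sanity-checked with a quick listing (the registry root below is a placeholder for your hdfsRegistryRoot):

hdfs dfs -ls /models/registry/Staging-ModelDeploy-v1.0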
Step 4: create Atlas model entity (maps to 2d in diagram)
#!/bin/bash
# script name: createAtlasModelEntity.sh
echo "starting createAtlasModelEntity.sh"

ATLAS_UU_PWD=$1
ATLAS_HOST=$2

echo "running command: curl -u ${ATLAS_UU_PWD} -ik -H \"Content-Type: application/json\" -X POST http://${ATLAS_HOST}:21000/api/atlas/v2/entity/bulk -d (omitting json)"

# positional arguments 3-15 follow the order the orchestrator passes them in
curl -u ${ATLAS_UU_PWD} -ik -H "Content-Type: application/json" -X POST http://${ATLAS_HOST}:21000/api/atlas/v2/entity/bulk -d '{
  "entities": [
    {
      "typeName": "model",
      "attributes": {
        "qualifiedName":      "'"${3}"'",
        "name":               "'"${4}"'",
        "deploy.datetime":    "'"${5}"'",
        "deploy.host.type":   "'"${7}"'",
        "deploy.host.detail": "'"${6}"'",
        "deploy.obj.source":  "'"${8}"'",
        "model.name":         "'"${9}"'",
        "model.type":         "'"${10}"'",
        "model.version":      "'"${11}"'",
        "model.description":  "'"${12}"'",
        "model.owner":        "'"${13}"'",
        "model.owner.lob":    "'"${14}"'",
        "model.registry.url": "'"${15}"'"
      }
    }
  ]
}'

echo "finished createAtlasModelEntity.sh"
We have:
Remember the key point: the deployment framework presented here is generalizable. Except for Atlas, you can plug in your choice of technologies for orchestration, staging, model hosting, and the model registry, including elaborating the framework into the formal software development framework of your choice.
Appreciation to the Hortonworks Data Science SME groups for their feedback on this idea. Particular appreciation to @Ian B and @Willie Engelbrecht for their deeper attention and interest.