Created on 04-15-2021 02:59 PM - edited on 04-19-2021 03:48 AM by subratadas
This post covers the steps required to build a custom runtime image for Cloudera Data Engineering (CDE). The process uses AWS CodeBuild to pull the base image from container.repository.cloudera.com, build a custom image from the provided Dockerfile, and upload the result to Amazon ECR. All the files mentioned in this post can be downloaded from here.
aws cloudformation create-stack \
--stack-name vkar-ecr \
--template-body file://cloudformation-ecr-codebuild.yml \
--parameters file://cloudformation-parameters.json \
--tags file://cloudformation-tags.json \
--capabilities CAPABILITY_NAMED_IAM
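Stack creation is asynchronous, so it may help to block until the stack is ready before moving on. The following is a minimal sketch, not part of the original files: the helper name wait_for_stack is hypothetical, and it assumes the AWS CLI is configured with credentials that can read the stack.

```shell
#!/usr/bin/env bash
# Hypothetical helper: wait for a CloudFormation stack to finish creating,
# then print whatever outputs its template declares.
# Usage: wait_for_stack <stack-name>
wait_for_stack() {
  local stack_name="$1"
  # Blocks until CREATE_COMPLETE; returns non-zero if creation fails.
  aws cloudformation wait stack-create-complete \
    --stack-name "$stack_name" || return 1
  # Show the stack outputs (for example, an ECR repository URI if the
  # template exports one; that depends on the template).
  aws cloudformation describe-stacks \
    --stack-name "$stack_name" \
    --query 'Stacks[0].Outputs' \
    --output table
}

# Example (needs AWS credentials):
#   wait_for_stack vkar-ecr
```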
aws cloudformation create-change-set \
--stack-name vkar-ecr \
--change-set-name change1 \
--template-body file://cloudformation-ecr-codebuild.yml \
--parameters file://cloudformation-parameters.json \
--tags file://cloudformation-tags.json \
--capabilities CAPABILITY_NAMED_IAM
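Note that create-change-set only prepares the update; the stack is not modified until the change set is executed. A hedged sketch of that follow-up step is shown here. The helper name apply_change_set is hypothetical, and the stack and change set names are the ones used in the command above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: execute a previously created change set and wait
# for the stack update to finish.
# Usage: apply_change_set <stack-name> <change-set-name>
apply_change_set() {
  local stack_name="$1"
  local change_set="$2"
  # Wait until CloudFormation has finished computing the change set.
  aws cloudformation wait change-set-create-complete \
    --stack-name "$stack_name" \
    --change-set-name "$change_set" &&
  # Apply the changes to the stack.
  aws cloudformation execute-change-set \
    --stack-name "$stack_name" \
    --change-set-name "$change_set" &&
  # Block until the update completes (non-zero exit on rollback).
  aws cloudformation wait stack-update-complete \
    --stack-name "$stack_name"
}

# Example (needs AWS credentials):
#   apply_change_set vkar-ecr change1
```

Reviewing the change set first with aws cloudformation describe-change-set is a reasonable precaution before executing it.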
aws codebuild create-project --cli-input-json file://aws-codebuild.json
aws codebuild start-build --project-name cde-ml-xgboost-build
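start-build returns as soon as the build is queued, so the image may not be in ECR yet when the command exits. The sketch below polls the build status until it leaves IN_PROGRESS. The helper name wait_for_build and the default polling interval are assumptions for illustration; the project name is the one used above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: poll a CodeBuild build until it finishes.
# Prints the final status and returns 0 only if the build succeeded.
# Usage: wait_for_build <build-id> [poll-interval-seconds]
wait_for_build() {
  local build_id="$1"
  local interval="${2:-15}"
  local status
  while true; do
    status="$(aws codebuild batch-get-builds \
      --ids "$build_id" \
      --query 'builds[0].buildStatus' \
      --output text)" || return 2
    # Any status other than IN_PROGRESS is final
    # (SUCCEEDED, FAILED, FAULT, TIMED_OUT, STOPPED).
    if [ "$status" != "IN_PROGRESS" ]; then
      break
    fi
    sleep "$interval"
  done
  echo "$status"
  [ "$status" = "SUCCEEDED" ]
}

# Example (needs AWS credentials): start a build, capture its ID, and wait.
#   build_id="$(aws codebuild start-build \
#     --project-name cde-ml-xgboost-build \
#     --query 'build.id' --output text)"
#   wait_for_build "$build_id"
```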
Follow these steps to use the custom runtime image to run a job:
cde resource create --type="custom-runtime-image" \
--image-engine="spark2" \
--name="cde-runtime-ml" \
--image="123456789012.dkr.ecr.us-west-2.amazonaws.com/cde/cde-spark-runtime-2.4.5:ml-xgboost"
cde job create --type spark --name ml-scoring-job \
--runtime-image-resource-name cde-runtime-ml \
--application-file ./ml-scoring.py \
--num-executors 30 \
--executor-memory 4G \
--driver-memory 4G
cde job run --name ml-scoring-job
-------------------
Vijay Anand Karthikeyan