Cloudera Employee

Introduction

This article demonstrates how a machine learning (ML) engineer can use the Model Registry service in Cloudera Machine Learning (CML) for cataloging, versioning, and deploying models. The Model Registry serves as a catalog for the models in a Data Lake, and it also provides model lineage information about deployed models to data teams and administrators. For details on how to use the Model Registry and for additional documentation, see the References section below.

Creating the Model Registry for the Data Lake

The Model Registry is one of the core building blocks of MLOps, that is, DevOps for data science workflows. Note that a single model registry is created per CDP Data Lake, and it serves as the model catalog for all models in that Data Lake. To create a model registry, open the CML Control Plane and create a new model registry. If a model registry already exists for that Data Lake / CDP environment, you will not be allowed to create another one. The screen below shows the creation of the Model Registry. Note that the creation steps may differ depending on the chosen cloud provider (for example, on Azure you may be asked to provide NFS details). Refer to the Cloudera documentation for more details on the form factor you have chosen.

(Screenshot: VishRajagopalan_0-1701240658384.png)

Once the creation process is initiated, you should be able to see the details of the registry creation process by clicking on the registry name and looking at the Event History logs, as shown below:

(Screenshot: VishRajagopalan_1-1701240658033.png)
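The one-registry-per-Data-Lake rule described above can be pictured with a small sketch. This is purely illustrative and is not a Cloudera API: the class `ControlPlaneSketch`, its methods, and the environment and registry names are all hypothetical, and the real check is performed by the CML Control Plane when you submit the creation form.

```python
class RegistryAlreadyExistsError(Exception):
    """Raised when an environment already has a model registry."""


class ControlPlaneSketch:
    """Hypothetical in-memory stand-in for the control plane (not a Cloudera API)."""

    def __init__(self):
        # Maps environment name -> registry name; at most one entry per environment.
        self._registries = {}

    def create_model_registry(self, environment, registry_name):
        # Refuse a second registry for an environment that already has one.
        if environment in self._registries:
            existing = self._registries[environment]
            raise RegistryAlreadyExistsError(
                f"{environment} already has registry {existing!r}"
            )
        self._registries[environment] = registry_name
        return registry_name

    def registry_for(self, environment):
        # Returns the registry name, or None if the environment has no registry.
        return self._registries.get(environment)


plane = ControlPlaneSketch()
plane.create_model_registry("dev-env", "dev-registry")  # hypothetical names
try:
    plane.create_model_registry("dev-env", "second-registry")
except RegistryAlreadyExistsError as err:
    print(f"Rejected: {err}")
```

The point of the sketch is only that the registry is keyed by environment, so every workspace in the same Data Lake shares the one catalog.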

Setting up access to the Model Registry

As mentioned earlier, setting up Model Registry access differs slightly based on the type of CDP environment. For a RAZ-enabled environment (a CDP Data Lake with access control configured through Apache Ranger), first copy the machine user's Workload User Name from the Model Registry Details page, shown below:

(Screenshot: VishRajagopalan_2-1701240658346.png)

The example here uses a non-RAZ-enabled development environment. As a first step, identify the model user and use it to set up the access permissions for your user.

(Screenshot: VishRajagopalan_3-1701240658397.png)

This concludes the one-time setup needed for the Model Registry for the Data Lake. To understand how to store models in the Model Registry and deploy them in the Cloudera Machine Learning service, refer to the article "Using Model Registry in Cloudera Machine Learning service" listed in the references.

References: 

  1. Cloudera Reference Docs
  2. Using Model Registry in Cloudera Machine Learning service