
A recent update to Cloudera Machine Learning (CML) adds the ability to use custom code editors through ML Runtimes. This article walks through creating an ML Runtime that uses a different editor and adding it to CML. First, you will build a Docker image that is configured to use a custom editor, specifically RStudio, and then add it to your workspace.

Step 1: Create and upload the Docker Image

Note: If you just want to use RStudio, you can skip this step and use an image that has already been uploaded:

 

peterableda/rstudio-cloudera-runtime:2022.04-8

 

You will need Docker installed and running for this step. First, clone the repo and change into the RStudio 1.4 directory by running the following commands in a terminal window.

$ git clone https://github.com/cloudera/community-ml-runtimes
$ cd community-ml-runtimes/rstudio_1.4
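
Before building, it can help to confirm the Docker prerequisite mentioned above. The following is a minimal sketch, not part of the original instructions: the check_docker function name is made up for illustration, and it only assumes that the docker CLI's info subcommand fails when the daemon is not reachable.

```shell
#!/bin/sh
# check_docker is a hypothetical helper name used only for this sketch.
# Returns 0 if the CLI is present and the daemon answers,
# 1 if the CLI is missing, 2 if the daemon cannot be reached.
check_docker() {
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker CLI not found on PATH"
        return 1
    fi
    if ! docker info >/dev/null 2>&1; then
        echo "docker CLI found, but the daemon is not reachable"
        return 2
    fi
    echo "docker is installed and running"
}

# Report the status without aborting the surrounding shell session.
check_docker || true
```

If the check reports a problem, start Docker (or fix your PATH) before moving on to the build.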

 

Now build the Docker image. Replace the peterableda/ prefix of the tag with the details of the container registry you want to use. The rstudio-cloudera-runtime:2022.04-8 part of the tag is up to you; 2022.04-8 follows the CalVer naming convention that we use for community images. The Dockerfile contains useful comments about its structure that can help you customize it for your own requirements.

 

$ docker build -t peterableda/rstudio-cloudera-runtime:2022.04-8 . -f Dockerfile

 

The next step is to push the image to your container registry.

 

$ docker push peterableda/rstudio-cloudera-runtime:2022.04-8
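
If you are pushing to your own registry, the build and push commands above can be parameterized so the tag is defined in one place. This is only a sketch under stated assumptions: registry.example.com/my-team is a placeholder registry, and the image_ref and run helpers are hypothetical names introduced here. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
#!/bin/sh
# Placeholder values (assumptions): replace with your own registry details.
REGISTRY="${REGISTRY:-registry.example.com/my-team}"
IMAGE_NAME="${IMAGE_NAME:-rstudio-cloudera-runtime}"
VERSION="${VERSION:-2022.04-8}"   # CalVer-style version, as in the article

# Compose the full image reference: <registry>/<name>:<version>
image_ref() {
    printf '%s/%s:%s\n' "$1" "$2" "$3"
}

# Print each command by default; run it for real only when DRY_RUN=0.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

TAG="$(image_ref "$REGISTRY" "$IMAGE_NAME" "$VERSION")"

# Same build and push steps as above, using the composed tag.
run docker build -t "$TAG" -f Dockerfile .
run docker push "$TAG"
```

Run it from the directory that contains the Dockerfile. Keeping the registry, name, and version in separate variables makes it easier to bump the version later without retyping the whole tag.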

 

Assuming the image push worked, you are good for the next step.

Step 2: Add the Runtime image to CML

Note: Adding a custom runtime requires the CML Public Cloud - August 31 release or a newer version. If you don't see the Runtime Catalog navigation item, or the Runtime Catalog page doesn't have the Add Runtime button, you might not have the right version or the right permissions. Please check with whoever manages your CDP environment.

From the CML start page, navigate to the Runtime Catalog and click Add Runtime.

peter_ableda_2-1658740950024.png

 

In the next step, paste in the link to the image you pushed in Step 1 and click Validate. Your CML instance will need access to this container registry to pull the image. In a restricted or air-gapped installation, public container registries might not be reachable, in which case you will need a private container registry deployed in an accessible network location. The validation process confirms that the image has the correct labels and can be imported. Click Add to Catalog when you are ready.

peter_ableda_0-1658740847759.png
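
Since validation depends on the image labels, it can be useful to look at them locally before pasting the link into CML. The sketch below is an optional extra check, not part of the CML workflow itself: the image reference is a placeholder and show_labels is a hypothetical helper name. It prints whatever labels the image carries; which labels CML expects is described in the Dockerfile comments rather than assumed here.

```shell
#!/bin/sh
# Placeholder image reference (assumption): use the tag you built in Step 1.
IMAGE="${IMAGE:-registry.example.com/my-team/rstudio-cloudera-runtime:2022.04-8}"

# Print the labels stored in a locally available image as a JSON object.
show_labels() {
    docker image inspect --format '{{json .Config.Labels}}' "$1"
}

# Only attempt the inspection when the docker CLI is available.
if command -v docker >/dev/null 2>&1; then
    show_labels "$IMAGE" ||
        echo "Could not inspect $IMAGE (is it built or pulled locally?)"
fi
```

If validation in CML fails, comparing this output with the labels set in the Dockerfile is a reasonable first debugging step.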

 

Step 3: Use RStudio

Assuming all went well in the last step, you should now be able to use RStudio as an editor when starting a new session. From the New Session page, select RStudio as the editor:

peter_ableda_3-1658741052203.png

 

Once the session launches, you will see the familiar RStudio view embedded in the CML UI, in the same way that JupyterLab is embedded.

peter_ableda_4-1658741162417.png

 

While this process is specific to RStudio, it should work for any web-based editor that can be configured to run on a specific port.

Comments

@fletch_jeff can you tell me how we can install the cdsw package for R in order to launch workers in such a custom R Runtime?