Created on 10-05-2021 06:43 AM - edited on 07-25-2022 02:30 AM by peter_ableda
A recent update to Cloudera Machine Learning brings the ability to create custom code editors with ML runtimes. This article shows the process of creating and adding an ML Runtime to CML that uses a different editor. First, you will create a Docker image that is configured to use a custom editor, specifically RStudio, and then add it to your workspace.
Note: If you just want to use RStudio, you can skip this step and use an image that has already been uploaded:
peterableda/rstudio-cloudera-runtime:2022.04-8
You will need Docker installed and running for this step. First, clone the repo and change into the RStudio 1.4 directory. In a terminal window, run the following commands.
$ git clone https://github.com/cloudera/community-ml-runtimes
$ cd community-ml-runtimes/rstudio_1.4
Now build the Docker image. Replace the peterableda/ prefix of the tag with the details of the container registry you want to use. The rstudio-cloudera-runtime:2022.04-8 part of the tag is up to you; 2022.04-8 follows the CalVer naming convention that we use for community images. The Dockerfile contains comments that explain its structure and can help you customize it for your own requirements.
$ docker build -t peterableda/rstudio-cloudera-runtime:2022.04-8 . -f Dockerfile
The next step is to push the image to your container registry.
$ docker push peterableda/rstudio-cloudera-runtime:2022.04-8
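The build and push commands above repeat the same tag, so it can help to assemble the tag once from its parts. The following is a minimal sketch, not part of the original repo: the registry value registry.example.com/my-team is a hypothetical placeholder, and the run helper and DRY_RUN variable are illustrative names introduced here. By default the script only prints the Docker commands it would run.

```shell
#!/bin/sh
# Assumed values: replace with the details of your own container registry.
REGISTRY="registry.example.com/my-team"   # hypothetical registry/namespace
IMAGE_NAME="rstudio-cloudera-runtime"     # image name used in this article
VERSION="2022.04-8"                       # CalVer-style version from this article
IMAGE_TAG="${REGISTRY}/${IMAGE_NAME}:${VERSION}"

# Hypothetical helper: print the command by default,
# execute it only when DRY_RUN=0 is set in the environment.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Same build and push commands as above, using the assembled tag.
run docker build -t "$IMAGE_TAG" . -f Dockerfile
run docker push "$IMAGE_TAG"
```

Run the script from the directory that contains the Dockerfile, with DRY_RUN=0 once the printed commands look right.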
Assuming the image push worked, you are good for the next step.
Note: Adding a custom runtime requires the CML Public Cloud August 31 release or newer. If you don't see the Runtime Catalog navigation item, or the Runtime Catalog page doesn't have the Add Runtime button, you might not have the right version or the right permissions. Please check with whoever manages your CDP environment.
Navigate to the Runtime Catalog from the CML start page, and click Add Runtime.
In the next step, paste in the link to the image you pushed in Step 1 and click Validate. Your CML instance needs access to this container registry to pull the image. In a restricted or air-gapped installation, public container registries might not be reachable, and you will need a private container registry deployed in an accessible network location. The validation process confirms that the image has the correct labels and can be imported. Click Add to Catalog when you are ready.
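Because validation checks the image labels, it can be useful to look at them locally before pasting the link into CML. The sketch below assumes the image is already present on your machine (built or pulled); show_image_labels is a hypothetical helper name, and the script makes no claim about which labels CML requires. It only prints what the image contains.

```shell
#!/bin/sh
# Hypothetical helper: print the labels of a locally available image as JSON.
show_image_labels() {
    image="$1"
    if ! command -v docker >/dev/null 2>&1; then
        echo "docker not found on PATH" >&2
        return 1
    fi
    # .Config.Labels holds the LABEL entries defined in the Dockerfile.
    docker image inspect --format '{{json .Config.Labels}}' "$image"
}

# Image tag from this article; replace it with your own tag.
IMAGE="peterableda/rstudio-cloudera-runtime:2022.04-8"
show_image_labels "$IMAGE" || echo "Could not inspect $IMAGE; build or pull it first." >&2
```

The Dockerfile comments in the repo are the place to check what each label means.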
Assuming all went well in the last step, you should now be able to use RStudio as an editor when starting a new session. From the New Session page, select RStudio as the editor:
Once the session launches, you will see a familiar view of RStudio embedded into the CML UI, in the same way that JupyterLab is embedded.
While this process is specific to RStudio, it should work for any web-based editor that can be configured to run on a specific port.
Created on 02-20-2023 03:24 AM
@fletch_jeff can you tell me how we can install the cdsw package for R in order to launch workers in such custom R Runtime?