In this article, we will guide you through detailed, step-by-step instructions on how administrators can create a custom runtime image for notebooks in Cloudera Machine Learning (CML), complete with custom extensions for VsCode.

We'll also provide a fully functional runtime image as an example, which you can integrate seamlessly into your CML environment.

Advantages: The following steps enable administrators to tailor a VsCode notebook by incorporating all the necessary extensions for end-users.

Disadvantages: End-users cannot permanently install extensions from within a session. Extensions they add during an active session are removed when the session ends; only the extensions built into the custom runtime image persist.

Custom Docker runtime image:

rcicakcloudera/vscodeextensions:latest

The image above was built by following the steps below.

Step 1: Write Dockerfile

The Dockerfile below installs an extension named "redhat.vscode-debug-adapter-apache-camel". To install additional extensions, add one more "RUN code-server --extensions-dir /usr/bin/custom_extensions --install-extension <extension-id>" line for each of them.

 

FROM docker.repository.cloudera.com/cloudera/cdsw/ml-runtime-jupyterlab-python3.9-standard:2022.04.1-b6

RUN apt update && apt upgrade -y && apt clean && rm -rf /var/lib/apt/lists/*
RUN curl -fsSL https://code-server.dev/install.sh | sh -s -- --version 4.2.0
RUN printf "#!/bin/bash\n/usr/bin/code-server --auth=none --extensions-dir=/usr/bin/custom_extensions --bind-addr=127.0.0.1:8090 --disable-telemetry" > /usr/local/bin/vscode
RUN chmod +x /usr/local/bin/vscode
RUN rm -f /usr/local/bin/ml-runtime-editor
RUN ln -s /usr/local/bin/vscode /usr/local/bin/ml-runtime-editor

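# Create the extensions directory while still root, then hand it over to cdsw
RUN mkdir -p /usr/bin/custom_extensions && chown cdsw /usr/bin/custom_extensions
USER cdsw
RUN code-server --extensions-dir /usr/bin/custom_extensions --install-extension redhat.vscode-debug-adapter-apache-camel
# Build-time check: list what was installed into the custom directory
RUN code-server --extensions-dir /usr/bin/custom_extensions --list-extensions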

# Override Runtime label and environment variables metadata
ENV ML_RUNTIME_EDITOR="VsCode" \
    ML_RUNTIME_EDITION="v4.2.0" \
    ML_RUNTIME_SHORT_VERSION="1.0" \
    ML_RUNTIME_MAINTENANCE_VERSION="1" \
    ML_RUNTIME_FULL_VERSION="1.0.1" \
    ML_RUNTIME_DESCRIPTION="This runtime includes VsCode editor"

LABEL com.cloudera.ml.runtime.editor=$ML_RUNTIME_EDITOR \
      com.cloudera.ml.runtime.edition=$ML_RUNTIME_EDITION \
      com.cloudera.ml.runtime.full-version=$ML_RUNTIME_FULL_VERSION \
      com.cloudera.ml.runtime.short-version=$ML_RUNTIME_SHORT_VERSION \
      com.cloudera.ml.runtime.maintenance-version=$ML_RUNTIME_MAINTENANCE_VERSION \
      com.cloudera.ml.runtime.description=$ML_RUNTIME_DESCRIPTION
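
When the list of extensions grows, one RUN line per extension becomes repetitive. The following is a minimal sketch, not part of the example image, of a helper script that loops over a list of extension IDs and installs each one into the same directory used in the Dockerfile. The script name, the EXTENSIONS and DRY_RUN variables, and the second extension ID (redhat.vscode-yaml) are illustrative assumptions. DRY_RUN defaults to 1, so the script only prints the commands it would run.

```shell
#!/bin/sh
# install_extensions.sh (hypothetical name): install a list of code-server extensions.
# EXT_DIR matches the --extensions-dir used in the Dockerfile.
EXT_DIR="${EXT_DIR:-/usr/bin/custom_extensions}"

# Space-separated extension IDs; the second ID is only an illustration.
EXTENSIONS="${EXTENSIONS:-redhat.vscode-debug-adapter-apache-camel redhat.vscode-yaml}"

# With DRY_RUN=1 (the default here) commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

# Install (or print the install command for) a single extension ID.
install_extension() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "code-server --extensions-dir $EXT_DIR --install-extension $1"
  else
    code-server --extensions-dir "$EXT_DIR" --install-extension "$1"
  fi
}

# Walk the list and stop at the first failure.
install_all() {
  for ext in $EXTENSIONS; do
    install_extension "$ext" || return 1
  done
}

install_all
```

One way to wire this in would be to COPY the script into the image and call it with DRY_RUN=0 from a RUN line placed after the switch to the cdsw user; that wiring is likewise an assumption rather than something the example image does.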

 

 

Step 2: Build the Image, Tag It, and Push

 

docker build -t d .
docker tag d rcicakcloudera/vscodeextensions:latest
docker push rcicakcloudera/vscodeextensions:latest
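
The build and push steps could also be scripted so that the image is built directly under its final name, which avoids looking up an image ID, and so that the installed extensions are listed before the image is published. This is a sketch under assumptions: the run and publish helpers and the IMAGE, TAG, and DRY_RUN variables are hypothetical names, and the check assumes the extensions live in /usr/bin/custom_extensions as in Step 1. DRY_RUN defaults to 1, so the commands are only printed.

```shell
#!/bin/sh
# Build, check, and push the runtime image. IMAGE and TAG mirror the names used in this article.
IMAGE="${IMAGE:-rcicakcloudera/vscodeextensions}"
TAG="${TAG:-latest}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only; 0 = execute them

# Print the command, and execute it unless DRY_RUN=1.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" != "1" ]; then
    "$@"
  fi
}

publish() {
  # Build straight to the final name, so no separate tag step is needed.
  run docker build -t "$IMAGE:$TAG" . || return 1
  # List the extensions baked into the image before publishing it.
  run docker run --rm --entrypoint code-server "$IMAGE:$TAG" \
    --extensions-dir /usr/bin/custom_extensions --list-extensions || return 1
  run docker push "$IMAGE:$TAG"
}

publish
```

Setting DRY_RUN=0 executes the three commands in order and stops at the first one that fails, so a broken build is never pushed.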

 

 

Step 3: Specify Docker image within CML Runtime

Step 4: Use VsCode Runtime

 

As mentioned previously, with the runtime built above only the CML administrator who creates the image can add extensions permanently; end-users can add extensions only for the duration of an active session. To let end-users install extensions permanently as well, consider the variant of the launcher lines shown after the caveats below. Shout-out to @aakulov (Oleksandr Akulov) for coming up with this! Keep these caveats in mind when creating a custom runtime that allows both administrators and end-users to install extensions permanently:

a) The initial launch of your custom runtime in a project will require additional time as the extensions are installed in real time.

b) A race condition may occur if two users simultaneously initiate their first session using your custom runtime in the same project.

 

RUN printf "#!/bin/bash\n/usr/bin/code-server --auth=none --bind-addr=127.0.0.1:8090 --disable-telemetry" > /usr/local/bin/vscode
RUN printf "#!/bin/bash\ncode-server --install-extension redhat.vscode-debug-adapter-apache-camel && /usr/local/bin/vscode" > /usr/local/bin/vscodemod
RUN chmod +x /usr/local/bin/vscode
RUN chmod +x /usr/local/bin/vscodemod
RUN rm -f /usr/local/bin/ml-runtime-editor
RUN ln -s /usr/local/bin/vscodemod /usr/local/bin/ml-runtime-editor
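
The one-line vscodemod script generated above could be expanded to skip extensions that are already installed, which would shorten every launch after the first one in a project. What follows is only a sketch of such a wrapper, not the script shipped in the example image: the EXTENSIONS variable and the needs_install and launch helpers are hypothetical names, and it assumes that "code-server --list-extensions" prints one extension ID per line.

```shell
#!/bin/bash
# Sketch of an expanded vscodemod; not the script shipped in the example image.
# Space-separated extension IDs that should be present (illustrative list).
EXTENSIONS="${EXTENSIONS:-redhat.vscode-debug-adapter-apache-camel}"

# Succeeds when extension ID $1 is missing from the newline-separated list in $2.
needs_install() {
  ! printf '%s\n' "$2" | grep -qixF -- "$1"
}

launch() {
  installed="$(code-server --list-extensions 2>/dev/null)"
  for ext in $EXTENSIONS; do
    if needs_install "$ext" "$installed"; then
      # A failed install is reported but does not block the editor.
      code-server --install-extension "$ext" \
        || echo "warning: could not install $ext" >&2
    fi
  done
  # Hand over to the plain launcher created in the Dockerfile.
  exec /usr/local/bin/vscode
}

# Launch only where code-server exists, so the helper can be tried elsewhere safely.
if command -v code-server >/dev/null 2>&1; then
  launch
else
  echo "code-server not found; skipping launch" >&2
fi
```

Unlike the one-liner, which chains the install and the launch with &&, this version still starts the editor when an installation fails, for example when the session has no network access. It does not address the race condition described in caveat b).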

 

 

With this variant, all extensions are installed in a single location before the VsCode editor launches. The key to the setup is the ml-runtime-editor symlink: instead of pointing directly at vscode, it points at a custom script named vscodemod, which installs the extensions and then launches vscode.

The purpose of this article is to address the challenges of the runtime's non-persistent filesystem. Changes made to the runtime filesystem during a session are lost when the session ends, because every new session starts from the runtime's original state. Additionally, the contents of the /home/cdsw directory cannot be set from within a runtime image: the directory is mounted from EFS, so any files the Docker image placed there are no longer available once a session begins. Administrators therefore have two options. They can add extensions permanently by moving the extension directory to a location in the runtime outside of /home/cdsw, or they can install extensions in real time into /home/cdsw, where they persist thanks to the EFS mount.
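
The two options can be told apart by where the extensions directory lives. The sketch below classifies a path accordingly; the storage_kind function and its two labels are hypothetical, and the second example path assumes that code-server's default extensions directory sits under the cdsw home directory.

```shell
#!/bin/sh
# Classify an extensions directory by where it lives, following the two options in this article.
# PROJECT_DIR is the mount point named in the article.
PROJECT_DIR="/home/cdsw"

storage_kind() {
  case "$1" in
    "$PROJECT_DIR"|"$PROJECT_DIR"/*)
      echo "project-storage" ;;   # persists across sessions; install at session start
    *)
      echo "image-layer" ;;       # ships with the image; install at build time
  esac
}

storage_kind /usr/bin/custom_extensions
storage_kind /home/cdsw/.local/share/code-server/extensions
```

The first call reports image-layer, matching the administrator-only approach from Step 1, and the second reports project-storage, matching the session-start installation performed by vscodemod.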

Big shoutouts to @pauldefusco (Paul de Fusco) and @amarinovszki (Arpad Marinovszki) for all your help hashing this solution out!

 

 

 
