Upgrading R Version in CDSW Image to 3.6.3 or 4.0.x

New Contributor

Hi everyone,

The CDSW base image v10 ships with R 3.5.1 (as documented in the CDSW Docs -> Pre-Installed Packages).

As this is an older R version, many R developers would like to upgrade to a more recent one: either 3.6.3 or 4.0.x (the most recent version is 4.0.2, which was just released).

Has anyone done this successfully, so that the updated R version can be used from both the Workbench - R editor and RStudio? What upgrade steps were involved?

Thanks!

2 REPLIES

Master Guru

@mattematics I can imagine cloning the base image and then installing or upgrading the desired packages. You can start from the blog post and docs below for reference.

https://blog.cloudera.com/customizing-docker-images-in-cloudera-data-science-workbench/

https://docs.cloudera.com/documentation/data-science-workbench/1-6-x/topics/cdsw_extensible_engines....


Cheers!

New Contributor

Hi @GangWar,

This is what I was aiming at.

My current Dockerfile looks roughly as follows:

 

#Dockerfile

FROM docker.repository.cloudera.com/cdsw/engine:10

WORKDIR /tmp

ENV R_HOME=/usr/local/lib/R

RUN wget http://cran.rstudio.com/src/base/R-3/R-3.6.3.tar.gz && \
    tar xvf R-3.6.3.tar.gz && \
    cd R-3.6.3 && \
    ./configure --prefix=/usr/local --enable-R-shlib && \
    make && \
    make install && \
    rm -rf /usr/local/bin/R && \
    rm -rf /usr/local/bin/Rscript && \
    ln -s /usr/local/lib/R/bin/R /usr/local/bin/R && \
    ln -s /usr/local/lib/R/bin/Rscript /usr/local/bin/Rscript && \
    printf '# make libR.so visible to ld.so\n/usr/local/lib/R/lib\n' > /etc/ld.so.conf.d/libR.conf && \
    ldconfig && \
    cd .. && \
    rm -rf R-3.6.3.tar.gz && \
    rm -rf R-3.6.3


# the java installation is mounted at CDSW session run time - copy it to the build context here
COPY ./java /usr/lib/jvm/java-openjdk

RUN export JAVA_HOME=/usr/lib/jvm/java-openjdk && \
    R CMD javareconf
    
RUN Rscript -e "update.packages(checkBuilt=TRUE, ask=FALSE, repos='https://cloud.r-project.org')"

# remove java installation again since it is mounted at runtime
RUN rm -rf /usr/lib/jvm
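
A quick sanity check after building the image is to confirm which R version the `R` on the PATH actually reports, and whether `libR.so` ended up in the ld.so cache. The sketch below is only an illustration: `parse_r_version` is a hypothetical helper name, and it assumes the first line of `R --version` has the form `R version X.Y.Z (...)`, as in the banner R prints at startup.

```shell
#!/bin/sh
# Hypothetical helper (not part of CDSW or R): extract "X.Y.Z" from the
# "R version X.Y.Z (...)" banner line read on stdin.
parse_r_version() {
    sed -n 's/^R version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Only query R if it is installed, so the snippet is harmless elsewhere.
if command -v R >/dev/null 2>&1; then
    echo "R on PATH reports version: $(R --version | parse_r_version)"
    # libR.so should be listed once ldconfig has picked up libR.conf.
    ldconfig -p | grep 'libR\.so' || echo "libR.so not in the ld.so cache"
else
    echo "R not found on PATH"
fi
```

Run inside a session started from the custom image, this should print 3.6.3 if the new build is the one being picked up.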

 

I can successfully launch CDSW sessions from this image with the following editors:

  • Workbench - Python
  • Workbench - Scala
  • RStudio
  • Jupyter Notebook (even with IRkernel)

The remaining issue is that a session with the Workbench - R editor cannot be launched. The session exits immediately and the following error appears in the log:

...
PID of main R process is 205
PID of parser R process is 207
R has exited with code 2 and signal null
Exiting with code 2
...

 

Using the base image, which works with the Workbench - R editor, I checked which processes CDSW tries to launch in this case. These are the following two:

cdsw       197    51  0 08:29 ?        00:00:00 /usr/local/lib/R/bin/exec/R --sense --no-readline --args
cdsw       198    51  0 08:29 ?        00:00:00 /usr/local/lib/R/bin/Rserve --RS-socket /tmp/cdsw-rserve-x0jbdet9dfe1428o.sock --RS-source /usr/local/lib/node_modules/r-engine/lib/parse.utils.r --slave

 

Using my own image and a Workbench - Python session, I was able to start both of these processes manually.

 

Process 1:

bash$ /usr/local/lib/R/bin/exec/R --sense --no-readline --args

WARNING: unknown option '--sense'


R version 3.6.3 (2020-02-29) -- "Holding the Windsock"
Copyright (C) 2020 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

>

 

Process 2:

bash$ /usr/local/lib/R/bin/Rserve --RS-socket /tmp/cdsw-rserve-x0jbdet9dfe1428o.sock --RS-source /usr/local/lib/node_modules/r-engine/lib/parse.utils.r --slave

Rserve started in non-daemon mode.
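
Both manual runs above were started from an interactive shell. CDSW launches them as child processes, presumably without a terminal on stdin (this is an assumption, nothing in the log confirms it), so it may be worth repeating the test that way and recording the exit code. A minimal sketch, with `run_noninteractive` being a made-up helper name:

```shell
#!/bin/sh
# Hypothetical helper: run a command with stdin redirected from /dev/null,
# show what it wrote to stderr, and print its exit code on the last line.
run_noninteractive() {
    err_file="$(mktemp)"
    "$@" </dev/null 2>"$err_file"
    status=$?
    if [ -s "$err_file" ]; then
        echo "--- stderr ---"
        cat "$err_file"
    fi
    rm -f "$err_file"
    echo "exit code: $status"
}

# Example with a stand-in command that fails with the code seen in the log.
run_noninteractive sh -c 'echo "simulated failure" >&2; exit 2'
```

In a session on the custom image, the same helper could wrap the first command from the process list, e.g. `run_noninteractive /usr/local/lib/R/bin/exec/R --sense --no-readline --args`, to see whether the exit code and any message on stderr differ from the interactive run. Rserve in non-daemon mode keeps running, so wrapping that one would block until it is stopped.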

 

To conclude: CDSW seems to do something behind the scenes when launching the Workbench - R editor that causes the session to crash, but I cannot get to the bottom of it.

 

Any ideas?