Add file to base CDSW image build.
Created ‎03-04-2021 07:48 AM
CDSW version 1.7
I need to add a file to every image that CDSW builds.
jupyter_notebook_config.py needs to be placed in the .jupyter directory of every project.
Our organization needs CDSW sessions to be culled after inactivity, and Jupyter notebooks prevent the built-in idle timeout from functioning.
Created ‎03-16-2021 06:57 AM
If you are a Cloudera rep, or a rep reads this, please understand that the main reason our organization chose CDSW is this functionality and the ability to edit the Docker build. We have been through Anaconda Enterprise and IBM Watson Studio, and even tried to run JupyterHub. Our organization has 150+ data scientists/analysts, and they are all irresponsible when it comes to stopping their sessions. Anaconda was by far the worst-performing product/company to work with for support.
Now with CDSW we can propagate project configuration, tutorial scripts, and Spark and Hive config from the top down to all projects via the Docker build, and we love it.
Below are the contents of jupyter_notebook_config.py, located in .jupyter. It worked for us.
c.NotebookApp.shutdown_no_activity_timeout = 3600
c.MappingKernelManager.cull_idle_timeout = 2600
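For anyone wanting to script the file placement, here is a minimal sketch that creates the .jupyter directory if it is missing and writes the two settings above into jupyter_notebook_config.py. The function name install_jupyter_config and the /home/cdsw path in the usage comment are assumptions for illustration, not something confirmed in this thread. Whether a file written at image build time is still visible inside a project session depends on how CDSW mounts project directories, which this thread does not settle.

```python
from pathlib import Path

# The two settings quoted in the post above. Inside a Jupyter config file,
# the name `c` is provided by Jupyter when it loads the file.
CONFIG_TEXT = """\
c.NotebookApp.shutdown_no_activity_timeout = 3600
c.MappingKernelManager.cull_idle_timeout = 2600
"""


def install_jupyter_config(home_dir):
    """Write jupyter_notebook_config.py under <home_dir>/.jupyter.

    Hypothetical helper: creates the directory when it does not exist yet
    and overwrites any existing config file with CONFIG_TEXT.
    """
    config_dir = Path(home_dir) / ".jupyter"
    config_dir.mkdir(parents=True, exist_ok=True)  # no error if already present
    config_path = config_dir / "jupyter_notebook_config.py"
    config_path.write_text(CONFIG_TEXT)
    return config_path


# Example call (path is an assumption about the project home directory):
# install_jupyter_config("/home/cdsw")
```

Calling the function twice is harmless, since the directory creation tolerates an existing directory and the file is simply rewritten.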
Created ‎03-16-2021 06:27 AM
This is a pretty interesting question. At first I was going to suggest just using a COPY command in your Dockerfile to copy this file over; however, I'm not totally positive that the .jupyter directory exists until you start up a CDSW session with a Jupyter notebook.
Can you show me what you are adding in the config file to time out the Jupyter notebooks?
You are correct that Jupyter notebooks do not time out from the IDLE_MAXIMUM_MINUTES environment variable. RStudio sessions do not either, and this has been a long-running and difficult issue since Cloudera doesn't write or control this code. It looks like a lot of this is fixed in CDSW 1.9, though.
If you just want to time out Jupyter notebooks, you could try editing the Jupyter Notebook command and adding this:
NOTEBOOK_TIMEOUT_SECONDS=$(python3 -c "print(${IDLE_MAXIMUM_MINUTES}*60)") /usr/local/bin/jupyter notebook --no-browser --ip=127.0.0.1 --port=${CDSW_APP_PORT} --NotebookApp.token= --NotebookApp.allow_remote_access=True --NotebookApp.quit_button=False --log-level=ERROR --NotebookApp.shutdown_no_activity_timeout=300 --MappingKernelManager.cull_idle_timeout=${NOTEBOOK_TIMEOUT_SECONDS} --TerminalManager.cull_inactive_timeout=${NOTEBOOK_TIMEOUT_SECONDS} --MappingKernelManager.cull_interval=60 --TerminalManager.cull_interval=60 --MappingKernelManager.cull_connected=True
This will kill Jupyter notebooks that have been inactive for longer than IDLE_MAXIMUM_MINUTES (60 minutes by default).
There are a few caveats. The main one is that this still won't kill Jupyter terminals, due to the version of Jupyter Notebook that CDSW 1.7 uses. Also, users will not get a warning; their notebook and the corresponding CDSW session will just get killed.
You can try this and let me know if it works. I'm also curious about the config file that you want to add into .jupyter.
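One thing worth checking with the command above: in a plain POSIX shell, a variable assigned as a prefix on a command line is not visible to the ${...} expansions on that same line, so the two timeout flags may come out empty unless NOTEBOOK_TIMEOUT_SECONDS was already set. A minimal sketch of an alternative is to compute the seconds first and then assemble the same flags. The function name build_notebook_command and the fallback port 8080 are assumptions for illustration; the flags themselves are copied from the command above.

```python
import os
import shlex


def build_notebook_command(idle_minutes, port):
    """Return the jupyter notebook argument list with timeouts filled in."""
    timeout_seconds = idle_minutes * 60  # minutes -> seconds
    return [
        "/usr/local/bin/jupyter", "notebook",
        "--no-browser",
        "--ip=127.0.0.1",
        "--port={}".format(port),
        "--NotebookApp.token=",
        "--NotebookApp.allow_remote_access=True",
        "--NotebookApp.quit_button=False",
        "--log-level=ERROR",
        "--NotebookApp.shutdown_no_activity_timeout=300",
        "--MappingKernelManager.cull_idle_timeout={}".format(timeout_seconds),
        "--TerminalManager.cull_inactive_timeout={}".format(timeout_seconds),
        "--MappingKernelManager.cull_interval=60",
        "--TerminalManager.cull_interval=60",
        "--MappingKernelManager.cull_connected=True",
    ]


if __name__ == "__main__":
    # Fall back to 60 minutes, the default mentioned above; 8080 is only a
    # placeholder port for running this sketch outside CDSW.
    minutes = int(os.environ.get("IDLE_MAXIMUM_MINUTES", "60"))
    port = os.environ.get("CDSW_APP_PORT", "8080")
    args = build_notebook_command(minutes, port)
    # Print a shell-quoted command line for inspection instead of running it.
    print(" ".join(shlex.quote(a) for a in args))
```

Printing the command makes it easy to confirm that both timeout flags carry a number before pasting the result into the editor command setting.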
