02-20-2020
11:53 AM
Some customers who upgrade from CDSW 1.5.x to 1.6.x or 1.7.x may have deprecated CDSW jobs that use base image v7, which runs containers as root. Base image v8 has enhanced security that disallows running containers as root; containers run as the cdsw user instead.
It is generally advised to re-create these jobs using base image v8, but in certain environments that is not feasible. The following instructions will help you change the CDSW base image for your jobs, in this case from v7 to v8.
To connect to the CDSW database, gain root on the master node and run:
kubectl exec -ti `kubectl get pods | grep Running | grep '^db' | awk '{print $1}'` -- psql -U sense
Confirm the engine IDs:
select * from engine_images;
Let's first explore which projects are using which engines. There are three tables involved: the "projects" table, the "engine_images" table, and the "projects_engine_images" table that joins the two together. This SQL statement returns one row per project, showing which engine each project is using:
select proj.id, proj.name, eng.id, eng.description, eng.repository, eng.tag
from projects proj inner join projects_engine_images mapping on proj.id = mapping.project_id
inner join engine_images eng on mapping.engine_image_id = eng.id;
You'll want to make sure that every project is using a "tag" of 8. If there are only a few projects, I recommend changing each one in the UI; if there are many projects to migrate, update the mapping table directly:
update projects_engine_images set engine_image_id = 8 where engine_image_id != 8;
Next let's look at jobs. Again there are three tables: as before there are the "projects" and "engine_images" tables, and this time we use the "jobs" table. The following SQL statement shows the project-job-engine mapping for all jobs:
select proj.id, proj.name, job.id, job.name, job.description, eng.id, eng.description, eng.repository, eng.tag
from projects proj inner join jobs job on proj.id = job.project_id
inner join engine_images eng on job.engine_image_id = eng.id;
Again, you'll want to look at the engine images used, and look for jobs with an engine tag that is not 8.
At this point you have a decision to make. If you just have a few jobs that are not engine version 8, you can go into the UI and delete and re-create the jobs. If you have many jobs, you'll want to edit the database directly.
If you're going to change the jobs in the database, first write down the 'id' (eng.id above) of the engine you want to migrate to. Then run
select proj.id, proj.name, job.id, job.name, job.description, eng.id, eng.description, eng.repository, eng.tag
from projects proj inner join jobs job on proj.id = job.project_id
inner join engine_images eng on job.engine_image_id = eng.id
where eng.id != 8;
I get this as a result:
id | name | id | name | description | id | description | repository | tag
----+--------+----+---------------------+-------------+----+-------------+--------------------------------------------+-----
4 | myoder | 33 | test engine 7 job | | 13 | Seven | docker.repository.cloudera.com/cdsw/engine | 7
My job is still running engine 7. More simply,
select id, name, description, engine_image_id from jobs where engine_image_id != 8;
We want to change job.engine_image_id to 8.
UPDATE jobs
SET engine_image_id = 8
WHERE engine_image_id != 8;
All the jobs are now moved to use the desired engine image.
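If you prefer to double-check outside psql, the same filtering can be done on the command line. The rows below are hypothetical stand-ins for what the jobs query returns when run with psql's -A (unaligned) and -t (tuples-only) flags, which emit one pipe-delimited row per job with the engine tag as the last field:

```shell
# Hypothetical pipe-delimited rows, as produced by psql -A -t
# (project id | project | job id | job name | engine id | engine tag)
sample='4|myoder|33|test engine 7 job|13|7
4|myoder|34|test engine 8 job|8|8'

# Collect job ids whose last field (the engine tag) is not 8
not_v8=$(echo "$sample" | awk -F'|' '$NF != 8 {print $3}')
echo "jobs still on an older engine: $not_v8"
```

An empty result means every job is already on the v8 engine.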
11-04-2019
01:46 PM
2 Kudos
At times customers want to change the default Docker bridge network; for example, their monitoring software may already use that range. The following shows the default IP range for docker0:
# ip a show docker0
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:be:4d:f3:e7 brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 scope global docker0
       valid_lft forever preferred_lft forever
CDSW itself does not use the docker0 bridge, and this interface is down. To change the range, however, a customer may create a daemon.json file defining a custom docker0 range. Documentation on that is linked below.
- https://docs.docker.com/v17.09/engine/userguide/networking/default_network/custom-docker0/
/etc/docker/daemon.json:
{
  "bip": "192.168.1.5/24",
  "fixed-cidr": "192.168.1.5/25"
}
Thereafter, stop CDSW, restart Docker on all nodes, and verify that the Docker daemon is up and that ip a (or ifconfig) shows the expected values for the docker0 interface. Once this is verified you can start CDSW again.
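The sequence can be sketched roughly as below. This is an outline, not a tested procedure: it assumes systemd-managed Docker and the cdsw CLI on the master node, so adjust for your environment.

```shell
# Outline of the change on a node (assumes systemd-managed Docker and
# the cdsw CLI on the master node; adjust for your environment).
apply_docker0_change() {
    cdsw stop                    # stop CDSW first (master node)
    systemctl restart docker     # pick up /etc/docker/daemon.json
    ip a show docker0            # verify the new bip took effect
    cdsw start                   # bring CDSW back up
}

# Only invoke on a real CDSW master node:
if command -v cdsw >/dev/null 2>&1; then
    apply_docker0_change
fi
```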
11-04-2019
01:16 PM
4 Kudos
Many customers have a high number of CDSW models they wish to deploy in their environments. Some receive a volume of model requests that would exceed the default 30-second timeout of these models.
According to the CDSW documentation, model replicas are "The engines that serve incoming requests to the model." Models are single-threaded and can only process one request at a time.
Replicas give models a measure of load balancing, fault tolerance, and concurrent request serving. The UI allows a maximum deployment of 9 replicas per model.
This UI limit can be circumvented by scaling the model manually with Kubernetes commands.
NOTE: Please perform these at your own risk.
One can attempt the following to scale up their model deployment.
Find model deployments.
- kubectl get deployments --all-namespaces
Scale select deployment.
- kubectl scale deployments sample-model --replicas=10
Running `kubectl scale` adjusts the deployment to the requested number of replica pods. The final result will look like this:
NAMESPACE   NAME           READY   STATUS    RESTARTS   AGE
default     sample-model   10/10   Running   0          23m
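The two steps above can be wrapped in a small helper that scales and then confirms the new count. "sample-model" and the replica count are placeholders for your own deployment, so treat this as a sketch:

```shell
# Scale a model deployment and confirm the result.
# "sample-model" and "10" below are placeholders, not real names.
scale_model() {
    deployment="$1"
    replicas="$2"
    kubectl scale deployment "$deployment" --replicas="$replicas"
    kubectl get deployment "$deployment"   # READY should show the new count
}

# Only invoke where kubectl and the deployment actually exist:
if command -v kubectl >/dev/null 2>&1; then
    scale_model sample-model 10
fi
```

Remember that the UI still believes the model has at most 9 replicas, so a redeploy from the UI may reset the count.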
10-09-2019
12:54 PM
2 Kudos
Customers at times have users that used CDSW in the past but are no longer in the organization. Unfortunately there are limitations within CDSW around 1) deleting those users, and 2) deleting projects associated with those users. As of CDSW 1.6 administrators can only "disable" users. Deleting projects associated with the users is a bit convoluted but possible, and requires access to the underlying PSQL database within CDSW and Kubernetes.
To delete the projects associated with a user (in this example, removing projects from /var/lib/cdsw/current/projects/projects/0/) you will need two queries.
1) Find which users are banned/disabled within CDSW:
# kubectl exec -it $(kubectl get pods -l role=db -o jsonpath='{.items[*].metadata.name}') -- psql -P pager=off -A -t -U sense -c "SELECT username FROM users where banned='t'" >> /tmp/list_usersdisabled.txt
2) Map users to projects and their respective locations on disk:
# kubectl exec -it $(kubectl get pods -l role=db -o jsonpath='{.items[*].metadata.name}') -- psql -P pager=off -A -t -U sense -c "select u.username, u.id as user_id, u.email, u.last_logout_at_tz as user_last_logout_at_tz, u.last_seen_at as user_last_seen_at, '/var/lib/cdsw/current/projects/projects/0/' || p.id as project_id, p.name as project_name from users u , projects p where u.id = p.user_id;" >> /tmp/mapping_projectstodisk.txt
Once you have these two files, you can perform the mapping in Excel, or alternatively run "grep -Ff file1 file2" to show all the projects associated with the banned users. From there you can delete them within the CDSW UI.
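The grep -Ff step can be illustrated with two toy files standing in for the query output. The usernames and project paths below are made up for the example:

```shell
# Toy stand-ins for the two query outputs (contents are made up).
printf 'alice\nbob\n' > /tmp/demo_disabled.txt
printf 'alice|1|/var/lib/cdsw/current/projects/projects/0/10|proj-a
carol|2|/var/lib/cdsw/current/projects/projects/0/11|proj-b
bob|3|/var/lib/cdsw/current/projects/projects/0/12|proj-c
' > /tmp/demo_mapping.txt

# -F treats patterns as fixed strings, -f reads them from a file:
# this prints only the mapping rows that mention a disabled user.
grep -Ff /tmp/demo_disabled.txt /tmp/demo_mapping.txt
```

Here the rows for alice and bob are printed and the row for carol is not, which is exactly the list of projects you would then delete in the UI.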
09-30-2019
11:55 AM
1 Kudo
A number of customers want to run CDSW R sessions within the Jupyter Notebook. This fails out of the box in CDSW and results in the following errors, depending on how you execute Jupyter:
Execution halted
WARNING:root:kernel 8166ed20-6142-44d1-92b8-9a0ae11777a9 restarted
Error in ok_device(filename, ...) : X11 is not available
Calls: <Anonymous> ... evaluate -> dev.new -> do.call -> <Anonymous> -> ok_device
Execution halted
By default the Jupyter R kernel is not shipped with CDSW 1.5+. To get this to work you will have to install the Jupyter R kernel manually. The instructions are as follows:
- https://irkernel.github.io/installation/
Instructions on installing Jupyter R Kernel from source.
1/2) Installing from source
ZMQ
You'll need zmq development headers to compile pbdZMQ.
R packages
Start R in the same terminal, and proceed as below.
You can install the packages via
install.packages(c('repr', 'IRdisplay', 'IRkernel'), type = 'source')
To update your source installation, repeat the step above.
2/2) Making the kernel available to Jupyter
The kernel spec can be installed for the current user with the following line from R:
IRkernel::installspec()
To install system-wide, set user to FALSE in the installspec command:
IRkernel::installspec(user = FALSE)
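From a terminal, the two R steps can be chained in one go. This is a sketch: it assumes R and the zmq headers are already installed, and it is guarded so it only runs where R exists.

```shell
# One-shot install sketch: build the kernel packages from source, then
# register the kernel spec system-wide. Assumes R and the zmq development
# headers are already present.
install_ir_kernel() {
    R --no-save -e "install.packages(c('repr', 'IRdisplay', 'IRkernel'), type = 'source')"
    R --no-save -e "IRkernel::installspec(user = FALSE)"
}

# Only invoke where R is actually installed:
if command -v R >/dev/null 2>&1; then
    install_ir_kernel
fi
```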