Member since: 07-09-2015
Posts: 70
Kudos Received: 29
Solutions: 12
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 12020 | 11-23-2018 03:38 AM |
| | 2853 | 10-07-2018 11:44 PM |
| | 3553 | 09-24-2018 12:09 AM |
| | 5683 | 09-13-2018 02:27 AM |
| | 3847 | 09-12-2018 02:27 AM |
11-20-2018
02:09 AM
Hi, CDSW runs an overlay network on top of your CDSW hosts, and the pods get their IPs from it (100.66.x.x). Based on your description, DNS resolution works on the host but not from inside the container. This can happen when multiple nameservers are configured in /etc/resolv.conf and some of them can't resolve your clouderamaster host. Find out which nameserver can resolve the host and drop the rest, or make sure that every nameserver can resolve it. I like to use the `dig @nameserver clouderamaster.com` command to test this. Regards, Peter
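The per-nameserver check described above can be scripted. This is only a minimal sketch, assuming Python 3 and the `dig` utility are available where you run it; the helper names (`parse_nameservers`, `dig_command`, `resolves`, `report`) are made up for this example, and `clouderamaster.com` is the placeholder hostname from the post, not a real value.

```python
import subprocess


def parse_nameservers(resolv_conf_text):
    """Return the nameserver addresses listed in resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        fields = line.split()
        # Commented-out entries start with '#' or ';', so they never match here.
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers


def dig_command(nameserver, hostname):
    """Build a dig invocation that queries one specific nameserver."""
    return ["dig", "@" + nameserver, hostname, "+short", "+time=3", "+tries=1"]


def resolves(nameserver, hostname):
    """True if dig gets at least one answer from this nameserver."""
    result = subprocess.run(dig_command(nameserver, hostname),
                            capture_output=True, text=True)
    return result.returncode == 0 and bool(result.stdout.strip())


def report(hostname, resolv_conf_path="/etc/resolv.conf"):
    """Print, for every configured nameserver, whether it resolves hostname."""
    with open(resolv_conf_path) as f:
        for nameserver in parse_nameservers(f.read()):
            status = "OK" if resolves(nameserver, hostname) else "FAILED"
            print(nameserver, status)
```

Calling `report("clouderamaster.com")` with your real hostname, both on the host and from a session terminal, should show which nameservers to keep.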
11-19-2018
03:59 AM
You need to make sure that forward and reverse DNS resolution works from the CDSW terminal to the host that runs the YARN ResourceManager and HDFS NameNode services. You referred to this host as clouderamaster.<domain>.com before. This issue is not related to DNS resolution of the CDSW master: you mentioned that you are using the session terminal, and since that works, the CDSW master DNS is configured properly. Regards, Peter
11-19-2018
12:45 AM
Hi, You seem to have a network connectivity issue. I have seen many variations of this. Please check that:
- you don't have a firewall between the machines,
- you resolve the hosts with DNS and not /etc/hosts (https://www.cloudera.com/documentation/data-science-workbench/1-3-x/topics/cdsw_known_issues.html#networking),
- DNS can do forward and reverse resolution on your master hostname/IP.

If all of the above are in order, this should work. Regards, Peter
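A quick way to try the forward/reverse check is sketched below, assuming Python 3. The function name `check_forward_reverse` and its result keys are made up for this example. It goes through the system resolver, which may also consult /etc/hosts, so a passing result does not by itself prove that DNS is the source of the answer.

```python
import socket


def check_forward_reverse(hostname,
                          forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Resolve hostname -> IP, then IP -> name, and report whether they agree."""
    ip = forward(hostname)
    reverse_name, aliases, _addresses = reverse(ip)
    # Compare case-insensitively and ignore a trailing dot on any name.
    known_names = {name.lower().rstrip(".") for name in [reverse_name, *aliases]}
    return {
        "hostname": hostname,
        "ip": ip,
        "reverse_name": reverse_name,
        "consistent": hostname.lower().rstrip(".") in known_names,
    }
```

Run it from the session terminal with the fully qualified name of your master host; `consistent` being false, or a `socket.gaierror` or `socket.herror` exception, points at the resolution problem.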
10-07-2018
11:44 PM
Hi, We are tracking an issue where the docker daemon does not pick up the proxy configuration. The easiest workaround is to download the 'nvidia/cuda:9.2-devel-ubuntu16.04' image to your local computer and then move it to the CDSW host with the `docker save`, `scp`, and `docker load` commands. After doing this, 'nvidia-docker run' should work. Regards, Peter
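The save, copy, and load sequence could look roughly like the sketch below, which only builds the command lines so you can review them before running anything. The helper name `image_transfer_commands`, the tarball name, the host name `cdsw-host.example.com`, and the `/tmp` target directory are all assumptions for illustration.

```python
import shlex


def image_transfer_commands(image, tarball, cdsw_host, remote_dir="/tmp"):
    """Return the commands that move a docker image to a host with no registry access."""
    remote_path = f"{remote_dir}/{tarball}"
    return [
        # On the local computer, which can reach the registry.
        ["docker", "pull", image],
        ["docker", "save", "-o", tarball, image],
        # Copy the archive to the CDSW host.
        ["scp", tarball, f"{cdsw_host}:{remote_path}"],
        # Load the archive into the docker daemon on the CDSW host.
        ["ssh", cdsw_host, "docker", "load", "-i", remote_path],
    ]


def print_commands(commands):
    """Print each command as a shell-quoted line."""
    for command in commands:
        print(shlex.join(command))
```

For example, `print_commands(image_transfer_commands("nvidia/cuda:9.2-devel-ubuntu16.04", "cuda-9.2-devel.tar", "cdsw-host.example.com"))` prints four lines that can be run one by one.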
09-24-2018
12:09 AM
1 Kudo
Hi Ibi, Are you hitting this on the 1.4.0 version? We are tracking a bug related to this (DSE-4293) and documented a workaround: https://www.cloudera.com/documentation/data-science-workbench/latest/topics/cdsw_known_issues.html#security__tls_ssl Regards, Peter
09-13-2018
02:27 AM
2 Kudos
Hi, This is a known issue in the CDSW 1.3 release. Please read the documentation about it: https://www.cloudera.com/documentation/data-science-workbench/1-3-x/topics/cdsw_known_issues.html#cds__py4j I also see that you are trying to create a SparkContext object. That should still work, but you might be better off using the new Spark 2.x interfaces. You can see a few examples here: https://www.cloudera.com/documentation/data-science-workbench/1-3-x/topics/cdsw_pyspark.html Regards, Peter
09-12-2018
02:27 AM
1 Kudo
Hi, CDSW relies on Kubernetes to allocate system resources to user workloads, and Kubernetes doesn't allow sharing a GPU core like it does for CPUs. The expectations are set in the Kubernetes documentation: "Containers (and pods) do not share GPUs. There’s no overcommitting of GPUs. Each container can request one or more GPUs. It is not possible to request a fraction of a GPU." (from: https://kubernetes.io/docs/tasks/manage-gpus/scheduling-gpus/) Regards, Peter
04-11-2018
01:33 AM
1 Kudo
Hi, The documentation has an image explaining this: https://www.cloudera.com/documentation/data-science-workbench/latest/topics/cdsw_dist_comp_with_Spark.html The answer is yes: if you start a Python 2 session and create a SparkSession object there, the Spark application runs in client mode and the Spark driver runs inside the CDSW session (docker container). This is the primary use case for CDSW. Regards, Peter
11-07-2017
04:49 AM
Hi Rob, This is a good question. Out of the box, CDSW gives you isolation by storing project-specific dependencies in the project folders. If you also need isolated dependencies on the cluster side (Spark Executor side), you need to follow the steps described in this blog post: https://blog.cloudera.com/blog/2017/04/use-your-favorite-python-library-on-pyspark-cluster-with-cloudera-data-science-workbench/ The "Cloudera Data Science Workbench is not ready yet" output from cdsw status while everything looks good is a known issue which we are currently working on. Regards, Peter
11-07-2017
12:18 AM
Hello Rob, You are trying to use the numpy library from inside a map function that iterates over an RDD. The transformation you specify runs on the Executors, which are hosted on different machines, wherever a YARN NodeManager is running. To make this work, you need to make sure that the numpy library is installed on all of the NodeManager machines. Regards, Peter
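One way to see where numpy is missing is to run a small probe inside a map, as sketched below. This assumes an existing SparkSession passed in as `spark`; the function names `probe_numpy` and `numpy_by_host` are made up for this example. The probe only reaches hosts that currently run an executor for your application, so it may not cover every NodeManager machine.

```python
import socket


def probe_numpy(_):
    """Runs on an executor: return this host's name and its numpy version (or None)."""
    try:
        import numpy
        version = numpy.__version__
    except ImportError:
        version = None
    return (socket.gethostname(), version)


def numpy_by_host(spark, tasks=100):
    """Spread many small tasks over the executors and collect one entry per host."""
    rdd = spark.sparkContext.parallelize(range(tasks), tasks)
    return dict(rdd.map(probe_numpy).distinct().collect())
```

Any host that maps to `None` in the result of `numpy_by_host(spark)` is an executor host where the Python used by Spark cannot import numpy.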