Member since: 04-28-2017
Posts: 41
Kudos Received: 14
Solutions: 11

My Accepted Solutions
| Title | Views | Posted |
| --- | --- | --- |
|  | 2248 | 09-05-2018 07:40 PM |
|  | 4660 | 02-15-2018 08:08 PM |
|  | 3582 | 11-21-2017 09:54 AM |
|  | 2832 | 11-13-2017 11:52 AM |
|  | 9327 | 07-25-2017 10:42 AM |
03-24-2019
08:06 PM
I am putting the arguments in the "Arguments" field in the "Run Experiment" UI
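For reference, a minimal sketch of how those values reach the script, assuming the "Arguments" field of Run Experiment is passed to the script as ordinary command-line arguments:

```python
# A minimal sketch, assuming the "Arguments" field is handed to the experiment
# script as ordinary command-line arguments readable via sys.argv or argparse.
import sys

# e.g. typing "20 fit.csv" into the Arguments field would print ['20', 'fit.csv']
print("Arguments received:", sys.argv[1:])
```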
11-05-2018
01:17 AM
@tristanzajonc Sorry to bother you again. If I am not wrong, Cloudera does not provide a sandbox for CDSW along with the CDH cluster. I have checked the CDH sandbox (5.13) and it contains only the CDH services, not CDSW. If we add the CDSW service it will create a port conflict, which is why Cloudera asks to install CDSW on a different node. Am I correct?
11-04-2018
08:16 AM
What did you do? Did you have to edit /etc/resolv.conf on the individual instances?
11-04-2018
07:44 AM
Do you have a guide on how to set up a proper wildcard DNS entry? I haven't worked with DNS servers before or looked at DNS configs.
05-23-2018
02:30 AM
Could you share the logs? Run the command below: cdsw logs
04-04-2018
11:29 AM
It sounds like requests is not installed on your executors. You could manually install the library on all executors, or ship it using Spark following the techniques outlined in this blog post: https://blog.cloudera.com/blog/2017/04/use-your-favorite-python-library-on-pyspark-cluster-with-cloudera-data-science-workbench/ . Tristan
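A minimal sketch of the second option, using a zip of pure-Python dependencies (a hypothetical deps.zip built in the project) rather than the conda-based workflow the blog post describes:

```python
# A minimal sketch: ship a zipped, pure-Python dependency to every executor and
# confirm it imports there. "deps.zip" is a hypothetical archive built in the project.
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("ship-deps-check").getOrCreate()
sc = spark.sparkContext

sc.addPyFile("deps.zip")  # added to sys.path on each executor

def requests_version(_):
    # Import inside the function so it runs on the executor, not the driver.
    import requests
    return requests.__version__

# Every partition reports the version of requests visible to its executor.
print(sc.parallelize(range(4), 4).map(requests_version).distinct().collect())
```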
02-24-2018
05:07 AM
Tristan, thanks for adding it to your list. JDBC is indeed something that users will be able to use without admin intervention, so that should help us out in the meantime. Within R/tidyverse, the odbc package is becoming popular, and although it still has a few problems we are pushing users towards that approach. Within the context of making Impala available to CDSW users, would it be possible to preconfigure the container's impala-shell with the cluster info? It's a small thing and we can specify all the info when using it, but explaining all that to users takes focus and time away from getting things done. In our case we will also use impala-shell in a small internal package to allow transfer of data from R back to an Impala table with a single function. Thanks! Bruno
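For what it's worth, the Python analogue of the ODBC/JDBC route would look something like the sketch below, assuming the impyla package is installed in the session and an Impala daemon is reachable at the placeholder host:

```python
# A hedged sketch, assuming impyla (pip install impyla) and an Impala daemon at
# the placeholder host/port below; the cluster details would normally come from admins.
from impala.dbapi import connect

conn = connect(host="impalad.example.com", port=21050)
cursor = conn.cursor()

cursor.execute("SHOW DATABASES")
for row in cursor.fetchall():
    print(row)

conn.close()
```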
02-21-2018
04:09 PM
Environment variables are currently set at runtime. You can override the global defaults in the Admin > Engine panel, or within a project under Settings > Environment. Best, Tristan
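As a small illustration, a session can pick those values up at runtime along these lines (MY_SETTING is a hypothetical variable name, not one CDSW defines):

```python
# A minimal sketch: read a variable set in Admin > Engine or in the project's
# Settings > Environment page. MY_SETTING is a hypothetical name.
import os

my_setting = os.environ.get("MY_SETTING", "default-value")
print("MY_SETTING =", my_setting)
```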
12-14-2017
09:38 AM
@tristanzajonc wrote: The <none> indicator is not an issue -- it simply indicates that those nodes are worker nodes and don't have stateful information stored on them. Hanging engines on "ContainerCreating" typically means you have not run "cdsw enable <worker-ip>" on the master node for all your worker nodes. This whitelists the IP of your worker nodes for NFS mounts. If you have not done this, containers can hang waiting for the project mounts to become available when scheduled onto a worker node. Please let me know if running "cdsw enable" for each worker IP resolves this issue. Thanks, Tristan

FYI, I was having the same issue, and this resolved it for me. Thanks.
11-25-2017
10:56 PM
Proxy config options tried:

Option 1: both http_proxy and no_proxy blank.
ERROR: The connection to the server 10.x.x.x:6443 was refused - did you specify the right host or port?

Option 2: http_proxy=proxyserver:8080 and no_proxy="127.0.0.1,localhost,10.masterIP,100.66.0.1,100.66.0.1/24".
ERROR: The connection to the server 10.x.x.x:6443 was refused - did you specify the right host or port?

Option 3: http_proxy="" and no_proxy="127.0.0.1,localhost,10.masterIP,100.66.0.1,100.66.0.1/24".
New error below:

etcd-cdswserver.local 1/1 Running 0 22m 10.x.x.x cdswserver.local
kube-apiserver-cdswserver.local 1/1 Running 0 21m 10.x.x.x cdswserver.local
kube-controller-manager-cdswserver.local 1/1 Running 0 22m 10.x.x.x cdswserver.local
kube-dns-3913472980-xtsrm 2/3 CrashLoopBackOff 15 22m 100.66.0.2 cdswserver.local
kube-proxy-2fqts 1/1 Running 0 22m 10.x.x.x cdswserver.local
kube-scheduler-cdswserver.local 1/1 Running 0 22m 10.x.x.x cdswserver.local
weave-net-hd8pw 2/2 Running 0 22m 10.x.x.x cdswserver.local

Cloudera Data Science Workbench Pod Status
WARNING: Unable to bring up pods tagged as 'k8s-app' in the kube-system cluster.
WARNING: Unable to bring up kube-system cluster.: 1
WARNING: Some pods in the CDSW application are not yet up.: 1
WARNING: Application services are incomplete
WARNING: Config maps are incomplete
WARNING: Secrets are incomplete
WARNING: Persistent volumes are incomplete
WARNING: Persistent volume claims are incomplete
WARNING: Ingresses are incomplete
WARNING: Unable to bring up pods tagged as 'k8s-app' in the kube-system cluster.
ERROR:: Cloudera Data Science Workbench is not ready yet: some system pods are not ready: 1

[admin@cdswserver ~]$ sudo cdsw logs
Generating Cloudera Data Science Workbench diagnostic bundle...
Checking system basics...
Saving kernel parameters...
Getting the list of kernel modules
Getting the list of systemd units
Checking validation output...
Checking application configuration...
Checking disks...
Checking Hadoop configuration...
Checking network...
Checking system services...
Checking Docker...
Checking Kubernetes...
Checking Kubelet...
Checking application services...
Checking cluster info...
Checking app cluster info...
Exporting user ids...
Error from server (NotFound): configmaps "internal-config" not found
error: pod name must be specified
Checking system logs...
Producing logs tarball...
11-23-2017
01:25 AM
Hi, I installed Spark 1.6 and 2 together on my cluster and they work like a charm. I have set the topic to resolved. Thanks, all.
11-13-2017
02:06 PM
Thank you both for the quick replies!
07-25-2017
12:30 PM
Thanks for your question. Standalone R and Python jobs run only on the CDSW edge nodes, where we have more control over dependency management using Docker. However, these jobs can push workloads into the cluster using tools like PySpark, Sparklyr, Impala, and Hive. This lets you get full dependency management for R and Python in the edge environment while still scaling specific workloads into the cluster. There is not currently a way to run the R and Python jobs themselves under YARN. In terms of SparkR, we recommend, but do not directly support, Sparklyr instead of SparkR. I hope that is helpful. Tristan
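A minimal sketch of that edge-to-cluster pattern in Python, assuming Spark on YARN is configured for the project and using a placeholder table name:

```python
# A minimal sketch: the Python process stays on the CDSW edge node while the
# query below runs on YARN executors in the cluster. "default.some_table" is a
# placeholder for any Hive/Impala-managed table.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("edge-to-cluster-example")
    .master("yarn")
    .getOrCreate()
)

spark.sql("SELECT COUNT(*) AS n FROM default.some_table").show()
spark.stop()
```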
07-25-2017
08:09 AM
1 Kudo
Hi, do you have step-by-step details for the wildcard DNS setup? I am stuck at that point. I have an AWS EC2 instance on which I have installed the workbench; however, I am unable to open the URL.
07-20-2017
10:21 AM
Thanks Tristan! I had found that mistake and corrected it. Thanks for your response. Regards, MG
07-11-2017
10:20 AM
At first login it should ask for a username/password to create a new account, but I am not getting that option. I am only getting the login page.
07-06-2017
10:27 AM
The analysis.py file is meant to be run within the CDSW console, not directly from the terminal. See the "Getting Started" guide within the CDSW documentation. Within CDSW, Python consoles are backed by Jupyter kernels, which have the necessary configuration to create plots. Best, Tristan
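As a small illustration (assuming matplotlib is available in the engine), a plot like the one below renders inline when run from the CDSW console, whereas a bare terminal session would not display it:

```python
# A minimal sketch: this renders inline in a CDSW Python console (backed by a
# Jupyter kernel) but would not display from a plain terminal session.
import matplotlib.pyplot as plt

xs = list(range(10))
ys = [x * x for x in xs]

plt.plot(xs, ys)
plt.title("Rendered inline by the Jupyter kernel")
plt.show()
```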
07-06-2017
10:11 AM
You need to configure a wildcard DNS entry to use CDSW. See the documentation here: https://www.cloudera.com/documentation/data-science-workbench/latest/topics/cdsw_install.html#set_up_wildcard_dns While some features will work without a wildcard, running engines, accessing the terminal, or viewing the Spark UI will not. Because these URLs are random, you cannot simply add them to your local hosts file. Thanks, Tristan
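One quick, unofficial way to sanity-check the wildcard entry from Python (cdsw.example.com is a placeholder for your configured CDSW domain): any random subdomain should resolve to the same address as the domain itself.

```python
# A hedged sketch for checking a wildcard DNS entry; "cdsw.example.com" is a
# placeholder for your configured CDSW domain.
import socket
import uuid

domain = "cdsw.example.com"
random_sub = f"{uuid.uuid4().hex[:8]}.{domain}"

print(domain, "->", socket.gethostbyname(domain))
print(random_sub, "->", socket.gethostbyname(random_sub))
# Both lookups should return the master node's address if the wildcard is in place.
```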
06-22-2017
07:07 AM
Support helped me. For the curious: even if there is no "forgot password" button, you can go to cdsw.whatever.com/forgot-password and reset it.
06-16-2017
01:51 AM
Hi Tristan, The problem is solved by your suggestion. Thanks. Polly
06-14-2017
03:00 PM
For real installations, you should pull from a repository. This will ensure all the nodes in your CDSW cluster have access to the image, not just the node where you built it. Moreover, you should not assume that your Docker image store is persistent across upgrades or in long-running clusters where we may evict less used images to free space. By pushing your custom images to a repository, you will ensure that images are never deleted due to image eviction policies or other administration tasks.
06-13-2017
01:36 AM
The problem was finally fixed by creating a new VM with CDH 5.8 installed.
05-25-2017
12:39 PM
Hi Tristan, You are right, configuring it globally was much easier, but we have tenant-specific queues and we want to keep them contained within their pools, which is why we needed an engine/project-specific setting. Anyway, thanks for your response. Regards, MG
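For reference, one way to keep a project's Spark workloads in its tenant's pool (an assumption about the setup, not a documented CDSW feature; "tenant_a_pool" is a placeholder queue name) is to pin the YARN queue on the session:

```python
# A hedged sketch: pin a project's Spark jobs to its tenant's YARN pool.
# "tenant_a_pool" is a placeholder queue name.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("tenant-scoped-session")
    .config("spark.yarn.queue", "tenant_a_pool")
    .getOrCreate()
)

print(spark.sparkContext.getConf().get("spark.yarn.queue"))
```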
05-25-2017
12:38 PM
Benassi, Within Cloudera Data Science Workbench you should be able to use almost any Python, R, or Java library you want. While we have not tested and do not support Apache Phoenix directly, you should be able to access it from within a session using the same methods you would on your local laptop. For instance, Phoenix has a Python client library here: https://phoenix.apache.org/phoenix_python.html You could also likely use the JDBC driver from R, Python, and Scala engines using your favorite database library. Best, Tristan
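As an illustration of the first route, a hedged sketch using the phoenixdb package linked above, assuming it is pip-installed in the session and a Phoenix Query Server is reachable at the placeholder URL:

```python
# A hedged sketch, assuming the phoenixdb Python client and a Phoenix Query
# Server at the placeholder URL below.
import phoenixdb

conn = phoenixdb.connect("http://phoenix-queryserver.example.com:8765/", autocommit=True)
cursor = conn.cursor()

cursor.execute("SELECT TABLE_NAME FROM SYSTEM.CATALOG LIMIT 5")
for row in cursor.fetchall():
    print(row)

conn.close()
```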
05-25-2017
12:31 PM
I'm glad this issue was resolved with 1.0.1. Let us know if you have any other problems.
05-01-2017
02:54 PM
Thanks for the update.
> Are you looking for support on a particular environment?
Yes, looking for Debian (Wheezy or Jessie) support.