Member since
11-16-2015
195
Posts
36
Kudos Received
16
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3420 | 10-23-2019 08:44 PM |
| | 2965 | 09-18-2019 09:48 AM |
| | 13533 | 09-18-2019 09:37 AM |
| | 2778 | 07-16-2019 10:58 AM |
| | 3810 | 04-05-2019 12:06 AM |
12-20-2019
10:43 AM
Thanks Srinivas. All is well. Thx.

Yes, IPv6 is still a requirement in 1.6.1. Efforts are still underway to investigate options that would NOT require IPv6 to be enabled in future CDSW versions.
10-24-2019
10:30 AM
@aahbs Thanks for the call today. Let's see if we can narrow those 401s down to the browser level (Chrome).
10-23-2019
08:44 PM
2 Kudos
@simps In CDSW version 1.6.0 there was an incorrect check in our code that failed engines if the /etc/krb5.conf file was missing. We fixed it in 1.6.1:

Fixed an issue where sessions on non-kerberized environments would throw the following error even though no principal was provided: "Kerberos principal provided, but no krb5.conf and cluster is not Kerberized." Cloudera Bug: DSE-7236

Please see if you can upgrade to this minor release. As a workaround, you can place a dummy krb5.conf in /etc/ on all CDSW hosts.

Regards,
Amit
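A minimal sketch of the dummy-file workaround, assuming an empty file is enough to satisfy the 1.6.0 check (the post does not say what the dummy file should contain). The function name `ensure_krb5_placeholder` is hypothetical, and it deliberately leaves an existing file alone so a real Kerberos config is never overwritten.

```shell
# Hypothetical helper: create an empty placeholder krb5.conf only if none exists.
# Assumption: an empty file is sufficient as the "dummy" krb5.conf.
ensure_krb5_placeholder() {
  conf_path="${1:-/etc/krb5.conf}"
  if [ -e "$conf_path" ]; then
    # Never overwrite a real Kerberos configuration.
    echo "exists: $conf_path"
  else
    touch "$conf_path" && echo "created: $conf_path"
  fi
}
```

Run as root on each CDSW host, `ensure_krb5_placeholder` with no argument targets /etc/krb5.conf.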
09-20-2019
11:51 PM
@aahbs good point. Organizations that use firewalls or proxies can block websockets. If Chrome Developer Tools shows websocket errors in your browser, that is likely the case. You might want to speak with your network admin and get this sorted.

Regarding the extension, see if you can download the Chrome extension on a machine that has internet connectivity, then copy it to your laptop (for example with scp) and install it manually.
09-20-2019
08:54 AM
@aahbs these 2 lines suggest the pod is ready from the k8s perspective:

2019-09-20 08:30:24.762 29 INFO Engine 76jt0ox8nexowxq5 Finish Registering running status: success
2019-09-20 08:30:24.763 29 INFO Engine 76jt0ox8nexowxq5 Pod is ready data = {"secondsSinceStartup":2.6,"engineModuleShare":2.092}

Basically, once the init process completes in the engine and the kernel (e.g. python) boots up the handler code in the engine, it directly updates the livelog status badge to show that the engine has transitioned from the Starting to the Running state. In our case this is broken, which could indicate a problem with websockets.

You can enable the developer console in the browser to check for websocket errors. To open the developer console in Chrome, click the three dots at the far right of the URL bar, then click More tools -> Developer tools -> Console.

To check whether the browser supports websockets and can connect, use the echo test from https://www.websocket.org/echo.html

You can also use a Chrome extension that lets you connect to the livelog pod from the browser using websockets, to confirm that there are no connectivity problems between the browser and CDSW's livelog.

Another thing to ensure is that you are able to resolve the wildcard subdomain from both your laptop and the server. For example, if you configured your DOMAIN in the CDSW configuration as "cdsw.company.com", then a dig *.cdsw.company.com and a dig cdsw.company.com should both return the A record correctly from your laptop and from the CDSW host.

You might also want to double check that there are no conflicting environment variables at the global or project level.
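The wildcard DNS check described above could be scripted roughly as follows. This is a sketch, not an official CDSW tool: the function names and the `wildcard-test` probe label are hypothetical, and instead of querying the literal `*` name it queries one arbitrary subdomain, which a wildcard record should also answer.

```shell
# Print the names to test for a given CDSW DOMAIN: the domain itself plus one
# arbitrary subdomain that the wildcard record should answer.
cdsw_dns_names() {
  domain="$1"
  probe="${2:-wildcard-test}"   # hypothetical label; any unused name works
  printf '%s\n%s.%s\n' "$domain" "$probe" "$domain"
}

# Resolve each name with dig and flag the ones that return no answer.
check_cdsw_dns() {
  cdsw_dns_names "$@" | while read -r name; do
    answer=$(dig +short "$name" A)
    if [ -n "$answer" ]; then
      echo "OK   $name -> $(echo "$answer" | tr '\n' ' ')"
    else
      echo "FAIL $name (no A record returned)"
    fi
  done
}
```

Run `check_cdsw_dns cdsw.company.com` once on the laptop and once on the CDSW host; both names should report OK in both places.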
09-20-2019
12:27 AM
@aahbs good to hear that you are past the node.js segfaults.

Regarding the session stuck in the launching state, start by having a look at the engine pod logs. The engine pod name will be the ID at the end of the session URL (in this case ilc5mjrqcy2hertx). You can then run kubectl get pods to find out the namespace that the pod is launched in:

kubectl get pods --all-namespaces=true | grep -i <engine ID>

Follow that with kubectl logs to review the logs of the engine and kinit containers:

kubectl logs <engineID> -n <namespace> -c engine

BTW, is this a new installation or an upgrade of an existing one? Do you use Kerberos and HTTPS? If TLS is enabled, are you using self-signed certificates?
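The two kubectl steps above can be chained in a small helper. This is only a sketch: the function names and the example URL shape are hypothetical, and the container name `kinit` is assumed from the mention of "kinit containers" in the post.

```shell
# Take the last path segment of the session URL as the engine ID.
engine_id_from_url() {
  url="${1%/}"            # drop a trailing slash, if any
  echo "${url##*/}"
}

# Look up the pod's namespace, then print logs for the engine and kinit containers.
engine_logs() {
  engine_id="$1"
  # The first column of `kubectl get pods --all-namespaces` is the namespace.
  ns=$(kubectl get pods --all-namespaces=true | grep -i "$engine_id" | awk '{print $1; exit}')
  [ -n "$ns" ] || { echo "no pod found for $engine_id" >&2; return 1; }
  kubectl logs "$engine_id" -n "$ns" -c engine
  # Assumption: the second container is literally named "kinit".
  kubectl logs "$engine_id" -n "$ns" -c kinit
}
```

On the CDSW master, `engine_logs "$(engine_id_from_url <session URL>)"` would print both logs in one go.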
09-18-2019
09:48 AM
@SrJay it looks like you are running CDSW 1.6 on a VMware host that has IPv6 disabled. Can you please confirm by reviewing the dmesg output for the words cmdline and segfault?

If you see segmentation faults for the node process and the kernel command line shows ipv6.disable=1, then you are likely hitting a known issue seen with the combination of node.js version 10.x, gRPC, and IPv6.

The workaround is to enable IPv6 on all hosts running CDSW using the following RedHat article: https://access.redhat.com/solutions/8709#rhel7enable
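A rough way to run that check on a host, assuming a Linux system where /proc/cmdline holds the boot parameters. The function names are hypothetical.

```shell
# Succeeds if the given kernel command line contains the ipv6.disable=1 token.
ipv6_disabled_in_cmdline() {
  printf '%s\n' "$1" | grep -Eq '(^|[[:space:]])ipv6\.disable=1([[:space:]]|$)'
}

# Report both symptoms: the boot parameter and node segfaults in the kernel log.
report_ipv6_segfault_hints() {
  if ipv6_disabled_in_cmdline "$(cat /proc/cmdline)"; then
    echo "ipv6.disable=1 is set on the kernel command line"
  fi
  # Segfault lines for the node process, if the kernel logged any.
  dmesg | grep -i 'segfault' | grep -i 'node'
}
```

If `report_ipv6_segfault_hints` prints both the command-line message and node segfault lines, the host matches the pattern described above.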
09-18-2019
09:37 AM
@aahbs we recently observed this with CDSW 1.6 on hosts that have IPv6 disabled. If you're hitting this behaviour, please check dmesg; it would likely show segfaults on the node process.

We are working internally to understand the gRPC behaviour and its connection with IPv6, but in the meantime you might want to enable IPv6 per the RedHat article https://access.redhat.com/solutions/8709#rhel7enable

1. Edit /etc/default/grub and delete the entry ipv6.disable=1 from GRUB_CMDLINE_LINUX, so that it looks like the following sample:
   GRUB_CMDLINE_LINUX="rd.lvm.lv=rhel/swap crashkernel=auto rd.lvm.lv=rhel/root"
2. Run the grub2-mkconfig command to regenerate the grub.cfg file:
   # grub2-mkconfig -o /boot/grub2/grub.cfg
   Alternatively, on UEFI systems, run the following:
   # grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg
3. Delete the file /etc/sysctl.d/ipv6.conf, which contains the entries:
   # To disable for all interfaces
   net.ipv6.conf.all.disable_ipv6 = 1
   # the protocol can be disabled for specific interfaces as well.
   net.ipv6.conf.<interface>.disable_ipv6 = 1
4. Check the content of the file /etc/ssh/sshd_config and make sure the AddressFamily line is commented out:
   #AddressFamily inet
5. Make sure the following line exists in /etc/hosts and is not commented out:
   ::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
6. Enable IPv6 support on the ethernet interface. Double check /etc/sysconfig/network and /etc/sysconfig/network-scripts/ifcfg-* and ensure they have IPV6INIT=yes. This setting is required for static and DHCP assignment of IPv6 addresses.
7. Stop the CDSW service.
8. Reboot the CDSW hosts to enable IPv6 support.
9. Start the CDSW service.
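Step 1 and the post-reboot check can be sketched as two small helpers. This is an illustration under assumptions, not part of the RedHat article: the function names are hypothetical, `strip_ipv6_disable` only prints the edited text and never modifies the file, and `ipv6_enabled` relies on the Linux /proc/sys/net/ipv6 interface.

```shell
# Print a GRUB config with the ipv6.disable=1 token removed.
# Output goes to stdout only; the input file is not modified.
strip_ipv6_disable() {
  sed -e 's/ ipv6\.disable=1//g' \
      -e 's/ipv6\.disable=1 //g' \
      -e 's/ipv6\.disable=1//g' "$1"
}

# After the reboot: a value of 0 here means IPv6 is enabled for all interfaces.
# The file is absent when IPv6 is disabled at boot, so this returns false then.
ipv6_enabled() {
  [ "$(cat /proc/sys/net/ipv6/conf/all/disable_ipv6 2>/dev/null)" = "0" ]
}
```

To preview the change before editing, `strip_ipv6_disable /etc/default/grub | diff /etc/default/grub -` shows exactly which line would differ.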
07-16-2019
10:58 AM
@rssanders3 Thanks for your interest in the upcoming CDSW release.

> Has a more specific date been announced yet?
Not yet publicly (but it should be out very soon).

> Specifically, will it run on 7.6?
Yes.
04-05-2019
12:06 AM
2 Kudos
Hello @Baris

There is no such limitation in CDSW. If a node has spare resources, Kubernetes can use that node to launch the pod. May I ask how many nodes there are in your CDSW cluster? What is the CPU and memory footprint on each node, and what version of CDSW are you running? And what error do you get when launching the session with > 50% memory?

You can find out how much spare capacity there is cluster-wide from the CDSW homepage (Dashboard). If you want to find out exactly how much spare capacity there is on each node, run $ kubectl describe node on the CDSW master server.

Example: In the snippet below you can see that out of 4 CPU (4000m), 3330m was used, and similarly out of 8GB RAM, around 6.3 GB (6482Mi) was used. This means that if you try to launch a session with 1 CPU or 2GB RAM it will not work.

$ kubectl describe nodes
Name:               host-aaaa
Capacity:
  cpu:     4
  memory:  8009452Ki
Allocatable:
  cpu:     4
  memory:  8009452Ki
Allocated:
  CPU Requests  CPU Limits  Memory Requests  Memory Limits
  ------------  ----------  ---------------  -------------
  3330m (83%)   0 (0%)      6482Mi (82%)     22774Mi (291%)

Do note that a session can only spin up an engine pod on one node. For example, if you have three nodes with 2GB RAM left on each of them, you might assume that you have 6GB of free RAM and can launch a session with 6GB memory. But because a session can't share resources across nodes, you'd eventually see an error like this: "Unschedulable: No nodes are available that match all of the predicates: Insufficient memory (3)"
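The arithmetic behind both points can be written out as a small sketch. The function names are hypothetical, and this only mimics the "fits on a single node" rule described above; it is not how the Kubernetes scheduler is implemented.

```shell
# Free amount on one node = allocatable minus what is already requested
# (both values must be in the same unit, e.g. millicores or Mi).
free_on_node() {
  echo $(( $1 - $2 ))
}

# Succeeds only if at least one single node has enough free memory for the request.
# Usage: can_schedule <request> <free_node1> [<free_node2> ...]
can_schedule() {
  request="$1"
  shift
  for free in "$@"; do
    [ "$free" -ge "$request" ] && return 0
  done
  return 1
}
```

With the numbers from the example, `free_on_node 4000 3330` gives 670 (millicores) and `free_on_node 7821 6482` gives 1339 (Mi, taking 8009452Ki as 7821Mi), so neither a 1 CPU (1000m) nor a 2GB (2048Mi) session fits. Likewise `can_schedule 6144 2048 2048 2048` fails even though the three nodes hold 6GB free in total.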