Member since 04-25-2020 | 11 Posts | 0 Kudos Received | 0 Solutions
04-22-2021
01:59 PM
@abajwa Hi, thanks for your help in the past. Now I have a new question: I want to try adding the Amundsen open-source data catalog to the environment to see how it exposes all the datasets that you've populated. Amundsen depends on LDAP or a similar identity service to recognize the user who's viewing the data in the system. Is there a local LDAP or other identity service included in this demo environment? Thanks for any pointers, -Antonio
02-04-2021
07:44 PM
"Just make sure the networking is setup as required by Hadoop: https://docs.cloudera.com/cloudera-manager/7.1.1/installation/topics/cdpdc-configure-network-names.h..."

That was the missing detail! For anyone else who tries this in VMware:

a) Do the network setup described at the link above on your host before you start the steps in this document.

b) At the end of the networking instructions I struggled with "Run host -v -t A $(hostname) and verify that the output matches the hostname command. The IP address should be the same as reported by ifconfig for eth0 (or bond0)..." I found that I needed my own DNS server within the environment before the networking finally behaved as described. I set up an instance of "dnsmasq" in my CentOS Linux environment. It's compact, lightweight, included with CentOS, and took about 3 minutes to configure, following the instructions here: https://brunopaz.dev/blog/setup-a-local-dns-server-for-your-projects-on-linux-with-dnsmasq

About five lines of config and I was off to the races 🙂
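For anyone who wants a scripted version of the hostname check, here is a rough Python sketch. It is not from the Cloudera docs, and the function name `check_forward_lookup` is made up for illustration. It resolves the hostname through the system resolver (which also consults /etc/hosts, so it is not a strict equivalent of `host -v -t A`) and flags the common failure where the name resolves to a loopback address instead of the real interface IP.

```python
import ipaddress
import socket


def check_forward_lookup(hostname, resolve=socket.gethostbyname):
    """Resolve hostname to an IPv4 address and judge whether it looks usable.

    Returns a tuple (ip, ok, reason). The resolve argument is injectable
    so the check can be exercised without touching the real resolver.
    """
    try:
        ip = resolve(hostname)
    except OSError as exc:  # socket.gaierror and socket.herror are OSError subclasses
        return None, False, f"lookup failed: {exc}"
    if ipaddress.ip_address(ip).is_loopback:
        # A 127.x.x.x answer means the name is not bound to the real interface.
        return ip, False, "resolves to a loopback address"
    return ip, True, "ok"
```

Calling `check_forward_lookup(socket.getfqdn())` on the host and comparing the returned IP by eye with what ifconfig reports for eth0 (or bond0) covers the same ground as the manual step, assuming a single IPv4 interface.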
02-03-2021
01:28 PM
Thanks for the suggestion. I will investigate at the next opportunity (probably over the weekend). I am QUITE sure that I did not have the "Configure Network Names" steps done correctly when I tried several months ago, but I just couldn't figure out how to fix it. This should help quite a bit.
02-03-2021
01:15 PM
Thanks for the clarification and for your efforts overall. I tried the self-install on a CentOS EC2 instance and that mostly worked. A number of the services report health issues (HDFS shows up in CM as not starting, for example), but surprisingly things seem to work anyway. I can run queries in the notebooks, for example, and the Ranger permissions apply. I will revisit my home environment VM and see if I have better luck. (I give it ample resources, 24 virtual CPUs, 96GB RAM, and 150GB storage, but I still seemed to hit issues.) Does the networking in EC2 provide any unusual features compared with a vanilla VM? For example, the EC2 environment answers at an internal AWS IP address as well as the public-facing one. Do the services communicate with each other in some way that may require that? Do I need to reproduce that for my self-hosted VM to work? Do I, for example, need more than one virtual NIC?
01-27-2021
12:14 PM
I'm trying to get this working in a VMware VM on an on-premises server. I'm running into a number of issues, which I will try to troubleshoot, but one thing that would help a lot, I think, is stating at the outset how much disk space is required. I have tried a couple of times and run into issues that seem to be related to not enough space on root or some other file system. I did find in a README that I probably need at least 100GB, which would have been useful to know before I created the VM. Sorry to complain, but I'm eager to get this working, finally, after a failed attempt about six months ago.
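A quick pre-flight check along these lines could catch the space problem before the install starts. This is only a sketch: the 100GB threshold is the README figure mentioned above, treating it as free space on the root file system is my assumption, and the function name `has_enough_space` is invented for illustration.

```python
import shutil

REQUIRED_GB = 100  # README figure from the post; assumed here to mean free space


def has_enough_space(path="/", required_gb=REQUIRED_GB, usage=shutil.disk_usage):
    """Return (ok, free_gb) for the file system that contains path.

    The usage argument is injectable so the check can be tried out
    without depending on the real disk.
    """
    free_gb = usage(path).free / 1024**3  # bytes to GiB
    return free_gb >= required_gb, free_gb
```

Running `has_enough_space("/")` inside the new VM before the install would show whether root is large enough. If the install writes to other mount points, the same call can be repeated with those paths.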