10-23-2017 05:12 AM - edited 10-23-2017 05:20 AM
I am very new to Cloudera and I would like to set up a sandbox environment. Should I get a QuickStart VM or start with Cloudera Manager and install it by myself?
My concern is meeting security requirements: the environment has to be inside my company's network, and I am not sure whether this can be done with the VM. Additionally, the operating system should receive updates to keep everything secure, so I would have to maintain it myself. Later on I will need to connect to other machines for data etc.
Can you please advise whether I can use the VM, or whether it is better to start with Cloudera Manager? Is there anything else I should consider?
Currently I have a Windows Server 2008 machine with 28 GB RAM, but instead of Windows I can get Ubuntu or CentOS.
10-23-2017 06:11 AM
Welcome to the Cloudera community. :) I like your idea of asking a question along with your intro message; it makes things more practical :)
As for your question, the QuickStart VM is definitely easier to begin with, but I would suggest you start testing in your own local VM, using for example VirtualBox.
When you get the hang of it, you can proceed with the next step, installing a (pseudo-distributed) cluster with Cloudera Manager, preferably still in local virtual machines, and as the next step I would go with company servers.
The only reason I would go through these steps is that you have better control over your local VMs, which is important for learning.
One more thing that you need to consider is that there is no way to "install" QuickStart VM on an empty, clean, machine - you just import a pre-installed, working, single-node cluster in a virtual machine.
Also, the QuickStart VM is not meant for distributed, multi-node use, so I would definitely install Cloudera Manager if you want to manage more nodes in the company infrastructure.
As for the server O.S., I would (personally) go with some form of Linux, preferably a supported distro and version, because it will make it easier, and you'll find answers to common questions that apply to your situation more easily.
I hope that answers some of your questions, but let me know if it doesn't.
10-23-2017 08:26 AM
Thank you samurai for a quick response that covers my questions! I still have some more... maybe you would be able to help me here too :)
Regarding the OS - is it possible to install Cloudera Manager on Windows?
Regarding QuickStart VM - it comes with Linux, could it use the same ports and connections as the machine it runs on?
My plan is to connect it later on to QlikView, which is on a different machine. I guess that's not a problem?
10-24-2017 08:16 AM
Hi again Anna,
Regarding the Windows question, it doesn't seem to be supported: link to CM documentation
Regarding the QuickStart VM, if I understood your question correctly, you can always set up your hypervisor (VirtualBox, VMware, KVM, etc.) to forward certain ports from your workstation to the virtual machine, or to allow communication between two virtual machines, for example a QlikView VM and a CM VM. I've tested both of these scenarios, and I can confirm that this is possible. For example, I've had a VM running RStudio connecting to a VM running QuickStart from Cloudera.
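Once port forwarding is configured in the hypervisor, a quick way to confirm it works is to try opening a TCP connection to each forwarded port from the workstation. Below is a minimal sketch in Python using only the standard library. The host `127.0.0.1` and the port list are assumptions for illustration (7180, 8888 and 21050 are the usual defaults for the Cloudera Manager UI, Hue and Impala, but your VM and forwarding rules may differ), and the helper names `port_is_reachable` and `check_ports` are made up for this example.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host: str, ports: dict, timeout: float = 2.0) -> dict:
    """Map each service label to True/False depending on whether its port accepts connections."""
    return {label: port_is_reachable(host, port, timeout) for label, port in ports.items()}


if __name__ == "__main__":
    # Assumed forwarding rules: host port == guest port, forwarded on localhost.
    # Adjust these to whatever you configured in your hypervisor.
    forwarded = {
        "Cloudera Manager UI": 7180,
        "Hue": 8888,
        "Impala (JDBC/ODBC)": 21050,
    }
    for label, ok in check_ports("127.0.0.1", forwarded).items():
        print(f"{label}: {'reachable' if ok else 'NOT reachable'}")
```

If a port shows as not reachable, the usual suspects are a missing forwarding rule, the service not running inside the VM, or a firewall in the guest. The same check can be pointed at another VM's IP address to test VM-to-VM communication.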
If I didn't understand your question correctly, feel free to give me some pointers, or more details :)
10-25-2017 01:25 AM
No problem, I'm glad if it helps. :)
By secure connection, do you mean SSL or Kerberos/Sentry? I know SSL can be done, since I've done it in a production environment, but I don't think I did it with these local dev/testing setups. The same goes for Kerberos: I did enable it when I tested it, but normally you don't need it with local virtual machines and dev/test data.