Thanks for the answer. When you say "Sandbox plays roles...", do you mean the Docker container within the VM? If so, any idea what the VM actually does besides providing the platform to run Docker?
I am just getting started with the HDP Sandbox and Hadoop in general, so I have quite a few newbie questions that I hope someone can kindly help answer. It seems the HDP 2.5 Sandbox now runs a Docker container within the VM. I discovered this, thanks to the community forums, when the Hadoop client tools didn't work after I ssh'ed into the VM but did work when I ssh'ed into the Docker container (port 2222). Can someone explain to me the different roles that the VM and the Docker container play as far as the HDP 2.5 Sandbox is concerned? Am I correct to assume that, since the Docker container has the client tools installed, it at least plays the role of "edge node"? Then, between the VM and the container, which one plays the roles of "name node" and "data node"? Or does the container play all the roles, with the VM being just a minimal OS that enables running Docker? Also, out of curiosity: in theory, would it not have been possible to create a sort of virtual Hadoop cluster from multiple Docker containers acting as different nodes, even on modest hardware? I am asking because the HDP Sandbox contains just one container, and I would have expected multiple containers playing different roles. Thanks in advance!