Created 10-07-2016 02:18 PM
I like to have multiple copies of the Hortonworks Sandbox for HDP. Typically these sandbox copies are used for specific demo scenarios. This has worked well with the previous VirtualBox-based sandbox. However, I'm wondering if there are problems when doing this with the Docker-based sandbox. Is there anything shared between docker containers based on a similar image?
To get around the problem before I knew the fix, I figured I would start with a completely new sandbox. So I repeated the installation procedure above, but gave the image a new name. Here is the output of docker images:
$ docker images
REPOSITORY      TAG      IMAGE ID       CREATED        SIZE
sandbox-atlas   latest   1ade17087a83   34 hours ago   15.34 GB
sandbox         latest   fc813bdc4bdd   2 weeks ago    14.57 GB
There are two images. The sandbox image is the original import. The sandbox-atlas image is the new import.
Here is the output of docker ps -a:
$ docker ps -a
CONTAINER ID   IMAGE           COMMAND               CREATED        STATUS                       PORTS   NAMES
cc74035a2d71   sandbox-atlas   "/usr/sbin/sshd -D"   24 hours ago   Exited (255) 4 seconds ago           hdp25-atlas-demo
381ff4de9d9c   sandbox         "/usr/sbin/sshd -D"   42 hours ago   Exited (0) 35 hours ago              atlas-demo
You should notice that each container is using a different image.
Is there anything shared between docker containers based on a similar, but different image?
Created 10-07-2016 02:47 PM
Regarding your question, "Is there anything shared between docker containers?", I was just reading an article on this and found the following:
Your data doesn’t live in the container, it lives in a named volume that is shared between 1-N containers that you define. You backup the data volume, and forget about the container. Optimally your containers are completely stateless and immutable.
Source: https://blog.docker.com/2016/03/containers-are-not-vms/
More in-depth details: https://docs.docker.com/engine/userguide/storagedriver/imagesandcontainers/
I'm not sure if this answers your question, but I thought I'd add it here.
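To make the quoted point concrete, here is a minimal sketch of one named volume being seen by two separate containers. The function name shared_volume_demo, the volume name demo-data, the alpine image, and the /data path are all assumptions chosen for illustration; none of them come from the sandbox setup.

```shell
# Hypothetical helper: shows that data written to a named volume by one
# container is visible to a second, unrelated container.
shared_volume_demo() {
    vol="${1:-demo-data}"   # illustrative volume name, not the sandbox's

    # Create the named volume on the Docker host.
    docker volume create "$vol"

    # Container A writes a file into the volume, then exits and is removed.
    docker run --rm -v "$vol":/data alpine \
        sh -c 'echo "written by container A" > /data/note.txt'

    # Container B mounts the same volume and reads the file back.
    docker run --rm -v "$vol":/data alpine cat /data/note.txt

    # Clean up the volume when done.
    docker volume rm "$vol"
}
```

Calling shared_volume_demo with no arguments runs the whole sequence. The file outlives container A because it is stored in the volume rather than in the container's own writable layer, which is the kind of sharing the quote describes.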
Created 10-07-2016 05:30 PM
This was very helpful information. I was thinking along the lines of VMs, which is the wrong model for Docker containers. I assumed that if I copied the base image and created a container from it, the data would not be shared between the two.
What I found is that the sandbox container mounts the /hadoop directory on the "host", which is actually the HyperKit Linux VM (on Mac). So even though I used different images and containers, they were sharing a common directory on the VM.
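One way to check for this kind of sharing is to compare the mounts of the two containers. This is a minimal sketch, assuming the container names from the docker ps -a output above; show_mounts is a hypothetical helper name, not part of Docker.

```shell
# Hypothetical helper: print the mount list (as JSON) for each container
# named on the command line, so the Source fields can be compared.
show_mounts() {
    for name in "$@"; do
        printf '%s: ' "$name"
        # .Mounts lists every volume and bind mount attached to the container.
        docker inspect -f '{{ json .Mounts }}' "$name"
    done
}
```

For example, show_mounts atlas-demo hdp25-atlas-demo prints one line per container. If both lines show the same Source (or the same volume Name) for the /hadoop Destination, the containers are reading and writing the same directory.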
I'm going to try working around this by mapping my container's /hadoop directory to my local Mac project working directory instead of the HyperKit VM, to see if that keeps the data separate.
Thank you!
Created 10-08-2016 12:13 AM
@Michael Young Thanks for sharing your findings, this is helpful.
Created 10-08-2016 01:14 AM
I wasn't able to use the local Mac directory for the /hadoop mount. The /etc/init.d/startup_script that runs in the container attempts to chown files to users and groups that don't exist on my Mac. So I found an alternative approach that I think is working well.
I wrote an article that addresses the shared directory issue: https://community.hortonworks.com/articles/60584/how-to-manage-multiple-copies-of-the-hdp-docker-sa....