OVERVIEW
Docker supplies multiple storage drivers to manage the mutable and immutable layers of images and containers. Many options exist with varying pros and cons.
Out of the box, docker uses the devicemapper loop-lvm storage driver. loop-lvm is not recommended for production, but requires zero setup to use.
When the base size of the mutable layer is increased, docker client operations were observed to slow down significantly. The alternative of using a smaller base size causes failures due to out-of-storage conditions.
The intent of this article is to outline testing that was performed to determine sane defaults for the docker storage driver options.
The following testing methodology was used:
The following scenarios were tested:
The following base sizes were tested:
The following container operation counts were tested:
The tests were run on the following hardware:
OS details:
Notes:
loop-lvm requires no up-front storage configuration and uses /var/lib/docker by default. In these tests, the docker data directory (--graph) was reconfigured to use a separate XFS mount on a SATA drive.
Example setup (optional if the OS disk is on SATA):
mkdir -p /docker/loop-xfs # path to the filesystem on SATA
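For reference, a minimal sketch of creating the backing XFS filesystem, assuming the SATA drive is exposed as /dev/sdb1 (the device name is an assumption and will differ per host):
mkfs -t xfs -f /dev/sdb1   # format the SATA partition with XFS
mount /dev/sdb1 /docker/loop-xfs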
Docker command:
/usr/bin/docker daemon --graph=/docker/loop-xfs \
  --storage-driver=devicemapper \
  --storage-opt dm.basesize=${BASESIZE}G \
  --storage-opt dm.loopdatasize=5000G \
  --storage-opt dm.loopmetadatasize=1000GB
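Once the daemon is up, the active driver and loop file settings can be sanity checked with docker info; exact field names vary between docker versions:
docker info | grep -iE 'storage driver|loop file|base device|pool name'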
Notes:
direct-lvm requires that one or more logical volumes be provisioned on the docker daemon node. A logical volume is then converted to a thin pool, which allows docker images and containers to be provisioned with minimal storage usage.
The docker-storage-setup script typically handles the logical volume setup for RHEL/CentOS when installing from the EPEL yum repos. However, when installing from the main docker repo to get the latest version of docker, this script is not included. It is not actually required, as the underlying LVM commands and docker configuration can be extracted from it. The instructions below do not include automatic expansion of the logical volumes, an additional feature supported by docker-storage-setup.
The direct-lvm approach allows either ext4 or xfs to be used; both were tested.
Example setup:
pvcreate -ffy /dev/sda4
vgcreate vg-docker /dev/sda4
lvcreate -L 209708s -n docker-poolmeta vg-docker
lvcreate -l 60%FREE -n docker-pool vg-docker <<< "y"
lvconvert -y --zero n -c 512K --thinpool vg-docker/docker-pool --poolmetadata vg-docker/docker-poolmeta
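Before starting the daemon, the resulting thin pool can be verified with standard LVM tooling (no docker-specific commands involved):
vgs vg-docker      # confirm the volume group exists and has free space
lvs -a vg-docker   # confirm the thin pool and its hidden data/metadata volumes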
Docker command (ext4):
/usr/bin/docker daemon --storage-driver=devicemapper \
  --storage-opt dm.basesize=${BASESIZE}G \
  --storage-opt dm.thinpooldev=/dev/mapper/vg--docker-docker--pool \
  --storage-opt dm.fs=ext4
Docker command (xfs):
/usr/bin/docker daemon --storage-driver=devicemapper \
  --storage-opt dm.basesize=${BASESIZE}G \
  --storage-opt dm.thinpooldev=/dev/mapper/vg--docker-docker--pool \
  --storage-opt dm.fs=xfs
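The commands above start the daemon in the foreground. One way to make the options persistent on a systemd host is a drop-in unit; the following is a sketch only, assuming the service is named docker.service, using the ext4 variant and 250G as an example base size:
mkdir -p /etc/systemd/system/docker.service.d
cat > /etc/systemd/system/docker.service.d/storage.conf <<'EOF'
[Service]
# clear the packaged ExecStart before redefining it
ExecStart=
ExecStart=/usr/bin/docker daemon --storage-driver=devicemapper \
  --storage-opt dm.basesize=250G \
  --storage-opt dm.thinpooldev=/dev/mapper/vg--docker-docker--pool \
  --storage-opt dm.fs=ext4
EOF
systemctl daemon-reload
systemctl restart docker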
Notes:
The docker btrfs option requires a btrfs filesystem, which has mixed support depending on OS distribution. Note that btrfs does not honor the dm.basesize setting. Each image and container is represented as a btrfs subvolume. As a result, the usable storage for docker is the total amount of storage available in the btrfs filesystem.
Example setup:
yum install btrfs-tools -y
modprobe btrfs
mkfs.btrfs -f /dev/sda4
mkdir -p /docker/btrfs   # mount point for the btrfs filesystem
mount /dev/sda4 /docker/btrfs
Docker command:
/usr/bin/docker daemon --graph /docker/btrfs --storage-driver=btrfs
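After pulling an image or starting a container, the per-layer subvolumes and overall filesystem usage can be inspected directly with the btrfs tooling (paths as configured above):
btrfs filesystem show /docker/btrfs
btrfs subvolume list /docker/btrfs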
Notes:
The docker zfs storage driver requires a zfs zpool to be created and mounted on the partition or disk where docker data should be stored. Snapshots (read-only) and Clones (read-write) are used to manage the images and containers. zfs does not honor, or even allow, the dm.basesize setting. As a result, the usable storage for docker is the total available space in the zpool.
Running zfs on RHEL/CentOS requires installing an unsigned kernel module. On modern PCs this is a problem, as modprobe will fail when UEFI Secure Boot is enabled. Secure Boot MUST be disabled via the UEFI or BIOS setup menu, depending on the system board manufacturer.
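If the mokutil package is installed, the current Secure Boot state can be checked before attempting to load the module (a convenience check, not part of the original test procedure):
mokutil --sb-state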
Example setup:
yum -y localinstall --nogpgcheck https://download.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm
yum -y localinstall --nogpgcheck http://archive.zfsonlinux.org/epel/zfs-release.el7.noarch.rpm
yum -y install kernel-devel zfs
modprobe zfs
mkdir -p /docker/zfs
zpool destroy -f zpool-docker
zpool create -f zpool-docker /dev/sda4
zfs create -o mountpoint=/docker/zfs zpool-docker/docker
Docker command:
/usr/bin/docker daemon --graph=/docker/zfs \
  --storage-driver=zfs
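The pool health and the docker dataset, along with the snapshots and clones created per layer, can be inspected with:
zpool status zpool-docker
zfs list -r -t all zpool-docker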
OverlayFS is a modern union filesystem that is similar to AUFS. It is layered on top of an existing filesystem such as ext4 or xfs. OverlayFS promises to be fast, but currently cannot be used with RPM on RHEL/CentOS 6 images or hosts. This issue is fixed in yum-utils-1.1.31-33.el7; however, that requires all images to be upgraded to the RHEL/CentOS 7.2 base image.
Originally, OverlayFS was tested, but not a single image could be successfully built using either ext4 or xfs as the backing filesystem. No results are available as part of this test to prove its speed. Additional testing will be conducted in the future when image upgrades are feasible.
OverlayFS also exhibits abnormally high inode usage, so increasing the number of inodes on the backing filesystem is necessary.
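Inode consumption on the backing filesystem can be watched with df; a minimal sketch, assuming docker data lives under /var/lib/docker as in the setup below:
df -i /var/lib/docker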
As a follow-up, OverlayFS now functions properly with RHEL/CentOS 7.2 based images. However, it was discovered that it honors neither the base size nor the graph storage location. The instructions and tools below have been updated to reflect these discoveries.
Example Setup:
modprobe overlay
# create the backing filesystem with extra inodes
mkfs -t ext4 -N 131072000 /dev/sda4
rm -rf /var/lib/docker/*
mount /dev/sda4 /var/lib/docker
Docker command:
/usr/bin/docker daemon --storage-driver=overlay \
  --storage-opt dm.fs=ext4
AUFS is the original union filesystem used with docker. It is no longer recommended for production.
AUFS requires a custom built kernel with support for AUFS. As a result, this option was not tested. OverlayFS is being touted as the replacement.
This section contains the results of the testing performed, as described earlier in this document.
Build times are erratic, making it difficult to truly assess the impact of the various base size and driver combinations. Median values were used in an attempt to normalize the results.
The following graph shows the build times of the base size and driver combinations. Note that ZFS and BTRFS do not honor the base size parameter, therefore the size listed is for the entire backing filesystem.
Summary: BTRFS was consistently faster than all other drivers, but is not yet recommended for production. Direct LVM leveraging Ext4 provided the most flexibility with minimal impact due to base size and is supported in production.
After building the image, the next steps were to run, stop, and remove the container based on the base image. Below are the results of those actions using a 250GB base size.
The following drills down into each of the operation types to show the relative differences between storage drivers.
Very little impact was found for all of the direct filesystem based approaches. However, when using the loop-lvm xfs backed driver, stop times were considerably higher. This aligns with the problem statement that the loop-lvm approach is slower at larger base sizes.
The following is a breakdown of the impact to operations as the base size increases for storage drivers that support supplying a base size.
btrfs and zfs were not tested as base size is not honored by those drivers.
As seen below, base size has a direct impact on the amount of time needed to stop a container. No other operations are impacted by the base size.
Increasing the base size does not significantly impact the direct filesystem approaches. The following outlines operation times across base image sizes for the direct-lvm ext4 approach.
The following outlines operation times across base image sizes for the direct-lvm xfs approach.
It is possible to execute docker run, stop, and remove operations in parallel; however, very little benefit is gained by doing so, and it complicates scheduling of containers. The exception is the stop operation, which accounts for the bulk of the time needed to deprovision containers. Running stop operations in parallel reduces the overall time needed to run, stop, and remove containers.
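A minimal sketch of parallelizing the stop step with xargs; the parallelism of 8 is an arbitrary example, not a value from these tests:
docker ps -q | xargs -r -P 8 -n 1 docker stop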
OverlayFS was compared to the currently recommended storage driver, LVM Direct Ext4.
As seen below, OverlayFS is nearly twice as fast as LVM Direct Ext4 for build operations. As previously mentioned, build times are erratic due to all of the downloads required; however, OverlayFS consistently beat the closest competitor.
OverlayFS is faster at most operations, but only marginally. A breakdown of each operation follows to show the relative difference between OverlayFS and LVM Direct Ext4.
Below is a summary of the pros and cons of each of the storage drivers tested: