Member since: 09-21-2015
Posts: 31
Kudos Received: 59
Solutions: 9
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 2553 | 06-01-2016 12:10 PM
 | 4751 | 03-08-2016 06:19 PM
 | 2351 | 01-19-2016 06:18 PM
 | 1973 | 12-15-2015 03:18 PM
 | 4512 | 12-03-2015 10:53 PM
03-09-2017
03:56 PM
2 Kudos
OVERVIEW

Docker supplies multiple storage drivers to manage the mutable and immutable layers of images and containers. Many options exist, each with pros and cons. Out of the box, docker uses devicemapper with loop-lvm. The loop-lvm storage driver is not recommended for production, but requires zero setup to leverage. When attempting to increase the base size of the mutable layer, it was observed that docker client operations slow down. The alternative of using smaller base sizes causes failures due to out-of-storage conditions. The intent of this article is to outline the testing that was performed to determine sane defaults for the docker storage driver options.

TESTING

The following testing methodology was used:
1. Build the centos6 image with different combinations of base sizes and storage drivers (build)
2. Create a container from the image (run)
3. Stop the container (stop)
4. Remove the container (rm)
5. Stop docker
6. Delete/reprovision the docker graph storage location
7. Repeat

The following scenarios were tested:
- loop-lvm (xfs)
- direct-lvm (ext4)
- direct-lvm (xfs)
- btrfs
- zfs
- overlay
- aufs

The following base sizes were tested:
- 25GB
- 50GB
- 100GB
- 250GB

The following container operation counts were tested:
- 1
- 10
- 25
- 50
- 100

The tests were run on the following hardware:
- Intel(R) Core(TM) i5-4460 CPU @ 3.20GHz, 4 cores
- 12GB memory
- 1x 1TB SATA (OS + Docker)

OS details:
- CentOS 7.2.1511
- Kernel: 3.10.0-327.4.5.el7.x86_64
- docker 1.9.1

Due to docker issue 17653, cgroupfs must be used instead of systemd on CentOS 7.2:
--exec-opt native.cgroupdriver=cgroupfs

LOOP-LVM

Notes: loop-lvm requires no up-front storage configuration, and uses /var/lib/docker by default. In these tests, the docker cache directory was reconfigured to use a separate XFS mount on a SATA drive.

Example setup (optional if the OS disk is already on SATA):

mkdir -p /docker/loop-xfs # path to the filesystem on SATA
Docker command:

/usr/bin/docker daemon --graph=/docker/loop-xfs \
  --storage-driver=devicemapper \
  --storage-opt dm.basesize=${BASESIZE}G \
  --storage-opt dm.loopdatasize=5000G \
  --storage-opt dm.loopmetadatasize=1000G
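The build/run/stop/rm loop described under TESTING can be sketched as a small timing harness. This is an illustrative sketch only: the image path, image tag, and container name are assumptions, and a docker daemon configured with the driver under test must already be running for the lifecycle calls to do anything.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the per-scenario benchmark loop.

time_op() {
  # Print how long a single operation takes, in whole seconds.
  local label=$1; shift
  local start end
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo "${label}: $((end - start))s"
}

# Only attempt the docker lifecycle when a daemon is reachable.
if docker info >/dev/null 2>&1; then
  time_op build docker build -t centos6-test /docker/images/centos6
  time_op run   docker run -d --name bench centos6-test /bin/sleep 3600
  time_op stop  docker stop bench
  time_op rm    docker rm bench
fi
```

Between scenarios, the daemon was stopped and the graph storage location deleted and reprovisioned, so each run started from a clean slate.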
DIRECT-LVM

Notes: direct-lvm requires that one or more logical volumes be provisioned on the docker daemon node. The logical volume is then converted to a thinpool to allow docker images and containers to be provisioned with minimal storage usage. The docker-storage-setup script typically handles the logical volume setup for RHEL/CentOS when installing from the EPEL yum repos. However, when installing from the main docker repo, to leverage the latest version of docker, this script is not included. The docker-storage-setup script is not actually required, as the necessary LVM commands and docker configuration can be extracted from it. The instructions below do not include auto-expansion of the logical volumes, which is an additional feature supported by docker-storage-setup. The direct-lvm approach allows for using ext4 or xfs; both were tested.

Example setup:

pvcreate -ffy /dev/sda4
vgcreate vg-docker /dev/sda4
lvcreate -L 209708s -n docker-poolmeta vg-docker
lvcreate -l 60%FREE -n docker-pool vg-docker <<< "y"
lvconvert -y --zero n -c 512K --thinpool vg-docker/docker-pool --poolmetadata vg-docker/docker-poolmeta
Docker command (ext4):

/usr/bin/docker daemon --storage-driver=devicemapper \
  --storage-opt dm.basesize=${BASESIZE}G \
  --storage-opt dm.thinpooldev=/dev/mapper/vg--docker-docker--pool \
  --storage-opt dm.fs=ext4
Docker command (xfs):

/usr/bin/docker daemon --storage-driver=devicemapper \
  --storage-opt dm.basesize=${BASESIZE}G \
  --storage-opt dm.thinpooldev=/dev/mapper/vg--docker-docker--pool \
  --storage-opt dm.fs=xfs
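After starting the daemon, it is worth confirming it actually attached to the thinpool rather than silently falling back to loopback. A hedged sketch follows; the pool device name is derived from the lvcreate commands above, and `docker info` is only meaningful on the daemon host.

```shell
# Succeeds if `docker info` output (read from stdin) lists the given thinpool.
uses_thinpool() {
  grep -q "Pool Name: *$1"
}

# Skipped automatically when no docker daemon is reachable.
if docker info >/dev/null 2>&1; then
  docker info 2>/dev/null | uses_thinpool "vg--docker-docker--pool" \
    && echo "thinpool in use"
fi
# lvs vg-docker   # the Data%/Meta% columns show pool usage growing with containers
```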
BTRFS

Notes: the docker btrfs option requires a btrfs filesystem, which has mixed support depending on the OS distribution. Note that btrfs does not honor the dm.basesize setting. Each image and container is represented as a btrfs subvolume; as a result, the usable storage for docker is the total amount of storage available in the btrfs filesystem.

Example setup:

yum install btrfs-tools -y
modprobe btrfs
mkfs.btrfs -f /dev/sda4
mount /dev/sda4 /docker/btrfs
Docker command:

/usr/bin/docker daemon --graph /docker/btrfs --storage-driver=btrfs
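Since every image layer and container becomes a btrfs subvolume, listing subvolumes doubles as a per-layer storage view. A sketch, assuming the /docker/btrfs mount from the setup above; it degrades gracefully when btrfs is not available:

```shell
# List docker's btrfs subvolumes (one per image layer/container), plus
# overall space usage for the backing filesystem.
list_docker_subvols() {
  if command -v btrfs >/dev/null 2>&1; then
    btrfs subvolume list /docker/btrfs 2>/dev/null \
      && btrfs filesystem df /docker/btrfs 2>/dev/null \
      || echo "/docker/btrfs not available"
  else
    echo "btrfs tools not installed"
  fi
}
list_docker_subvols
```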
ZFS

Notes: the docker zfs storage driver requires a zfs zpool to be created and mounted on the partition or disk where docker data should be stored. Snapshots (read-only) and clones (read-write) are used to manage the images and containers. zfs does not honor, or even allow, the dm.basesize setting; as a result, the usable storage for docker is the total available space in the zpool. Running zfs on RHEL/CentOS requires installing an unsigned kernel module. On modern PCs this is a problem, as modprobe will fail due to the UEFI SecureBoot feature. SecureBoot MUST be disabled via the UEFI or BIOS menu, depending on the system board manufacturer.

Example setup:
yum -y localinstall --nogpgcheck https://download.fedoraproject.org/pub/epel/7/x86_64/e/epel-release-7-5.noarch.rpm
yum -y localinstall --nogpgcheck http://archive.zfsonlinux.org/epel/zfs-release.el7.noarch.rpm
yum -y install kernel-devel zfs
modprobe zfs
mkdir -p /docker/zfs
zpool destroy -f zpool-docker
zpool create -f zpool-docker /dev/sda4
zfs create -o mountpoint=/docker/zfs zpool-docker/docker
Docker command:

/usr/bin/docker daemon --graph=/docker/zfs \
  --storage-driver=zfs
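Because the zfs driver builds images from read-only snapshots and containers from read-write clones, listing the pool's datasets shows the layering directly. A sketch, assuming the zpool-docker pool from the setup above; it skips cleanly on hosts without zfs:

```shell
# Show every dataset, snapshot, and clone docker has created in the pool.
list_docker_datasets() {
  if command -v zfs >/dev/null 2>&1 && zpool list zpool-docker >/dev/null 2>&1; then
    zfs list -t all -r zpool-docker
  else
    echo "zpool-docker not present; nothing to list"
  fi
}
list_docker_datasets
```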
OVERLAYFS

OverlayFS is a modern union filesystem that is similar to AUFS. It is layered on top of an existing filesystem such as ext4 or xfs. OverlayFS promises to be fast, but currently cannot be used with RPM on RHEL/CentOS 6 images or hosts. This issue is fixed in yum-utils-1.1.31-33.el7; however, this requires that all images be upgraded to the RHEL/CentOS 7.2 image. Originally, OverlayFS was tested, but not a single image could be successfully built, using either ext4 or xfs, so no results are available as part of this test to prove its speed. Additional testing will be conducted in the future when image upgrades are feasible. OverlayFS also exhibits abnormally high inode usage, so increasing the number of inodes on the backing filesystem is necessary.

As a follow-up, OverlayFS now functions properly with RHEL/CentOS 7.2-based images. However, it was discovered that it does not honor the base size or the graph storage location. The instructions below and the tools have been updated to reflect these discoveries.

Example setup:

modprobe overlay
# create the backing filesystem with extra inodes
mkfs -t ext4 -N 131072000 /dev/sda4
rm -rf /var/lib/docker/*
mount /dev/sda4 /var/lib/docker
Docker command:

/usr/bin/docker daemon --storage-driver=overlay
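The extra-inode figure used above can be derived from a simple bytes-per-inode budget. This is a sketch under an assumption: budgeting roughly one inode per 8KiB of backing storage reproduces the -N 131072000 value for a ~1000GiB partition, but the right ratio depends on how many small files your images contain.

```shell
# Inodes to request from mkfs for a filesystem of the given size in GiB,
# assuming one inode per 8KiB (8192 bytes) of backing storage.
inodes_for_gib() {
  echo $(( $1 * 1024 * 1024 * 1024 / 8192 ))
}

# Hypothetical usage, matching the mkfs invocation above for a ~1000GiB disk:
# mkfs -t ext4 -N "$(inodes_for_gib 1000)" /dev/sda4
```

Checking inode pressure afterwards with `df -i /var/lib/docker` shows whether the budget was generous enough.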
AUFS

AUFS is the original union filesystem used with docker. It is no longer recommended for production. AUFS requires a custom-built kernel with AUFS support; as a result, this option was not tested. OverlayFS is being touted as the replacement.

RESULTS

This section contains the results of the testing described above.

BASE SIZE AND DRIVER IMPACT ON BUILD TIMES

Build times are erratic, making it difficult to truly assess the impact of the various base size and driver combinations, so median values were used in an attempt to normalize results. The following graph shows the build times of the base size and driver combinations. Note that ZFS and BTRFS do not honor the base size parameter; for those drivers the size listed is the size of the entire backing filesystem.

Summary: BTRFS was consistently faster than all other drivers, but is not yet recommended for production. Direct-LVM with ext4 provided the most flexibility with minimal impact from base size, and is supported in production.

DRIVER TYPE IMPACT ON OPERATIONS

After building the image, the next steps were to run, stop, and remove a container based on the base image. Below are the results of those actions using a 250GB base size, drilling down into each operation type to show the relative differences between storage drivers. Very little impact was found for the direct filesystem-based approaches. However, with the loop-lvm xfs-backed driver, stop times were considerably higher. This aligns with the problem statement that the loop-lvm approach is slower at larger base sizes.

BASE SIZE IMPACT ON OPERATIONS

The following is a breakdown of the impact on operations as the base size increases, for the storage drivers that support supplying a base size. btrfs and zfs were not tested, as base size is not honored by those drivers.

loop-lvm xfs: As seen below, base size has a direct impact on the amount of time needed to stop a container. No other operations are impacted by the base size.

direct-lvm ext4: Increasing the base size does not significantly impact direct filesystem approaches. Below are the operation times across base image sizes for the direct-lvm ext4 approach.

direct-lvm xfs: Increasing the base size does not significantly impact direct filesystem approaches. Below are the operation times across base image sizes for the direct-lvm xfs approach.

PARALLEL CONTAINER OPERATIONS IMPACT

It is possible to execute docker run, stop, and remove operations in parallel; however, very little benefit is gained by doing so, and it complicates scheduling of containers. The exception is the stop operation, which is responsible for the bulk of the time needed to deprovision containers. Running stop operations in parallel will reduce the overall time needed to run, stop, and remove containers.

OVERLAYFS RESULTS

OverlayFS was compared to the currently recommended storage driver, direct-LVM with ext4.

BASE SIZE AND DRIVER IMPACT ON BUILD TIMES

As seen below, OverlayFS is nearly twice as fast as direct-LVM ext4 for build operations. As previously mentioned, build times are erratic due to the downloads required; however, OverlayFS consistently beat the closest competitor.

PARALLEL CONTAINER OPERATIONS IMPACT

OverlayFS is faster at most operations, but only marginally. A breakdown of each operation follows to show the relative difference between OverlayFS and direct-LVM ext4.

SUMMARY

Below is a summary of the pros and cons of each storage driver tested:
Loop-LVM XFS
Pros:
- No configuration required
- Decent performance at small base sizes
Cons:
- Poor performance at larger base sizes
- Not recommended for production

Direct-LVM Ext4
Pros:
- More performant than xfs at build, run, and stop operations
- Consistent performance for all tested base sizes
Cons:
- Requires dedicated storage, as LVM logical volumes
- Slightly slower than xfs for remove operations

Direct-LVM XFS
Pros:
- More performant than ext4 at remove operations
- Consistent performance for all tested base sizes
Cons:
- Requires dedicated storage, as LVM logical volumes
- Slower than ext4 at build, run, and stop operations

btrfs
Pros:
- No need to manage base size; docker can use all the space in the filesystem
- Most performant for build operations
Cons:
- Not recommended for production
- Requires dedicated storage, as a btrfs filesystem

zfs
Pros:
- No need to manage base size; docker can use all the space in the zpool
Cons:
- Not recommended for production
- Requires disabling UEFI SecureBoot at the system level
- Requires dedicated storage, as a zfs filesystem

overlayfs
Pros:
- Claims to be fast and efficient
- The "modern" union filesystem
Cons:
- Not yet production ready
- Not supported with RPM + CentOS 6
- Could not properly test due to this issue
- Potential fix available for RHEL/CentOS 7.2+ images

AUFS
Pros:
- The original union filesystem
Cons:
- Requires a custom kernel
- Could not properly test due to this issue
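The results above single out stop as the operation worth parallelizing. A minimal helper for that pattern follows; this is an illustrative sketch, and the fan-out of 10 concurrent jobs is an arbitrary assumption to tune per host.

```shell
# Run a command once per stdin line, with up to N jobs in parallel.
parallel_each() {
  local jobs=$1; shift
  xargs -r -P "$jobs" -n 1 "$@"
}

# Hypothetical usage: stop every running container, 10 stops at a time.
# docker ps -q | parallel_each 10 docker stop
```

Run and rm gain little from the same treatment, which matches the finding that parallelism mostly complicates container scheduling elsewhere.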
06-01-2016
06:15 PM
You are correct: use LVM for OS disks, but not data disks. In the end, the filesystem choice doesn't make a huge difference; ext4 everywhere would simplify the overall design and allow filesystems to be resized online in the future. Allocating a larger amount of storage to the OS filesystems does simplify the install. Otherwise, during the Ambari install wizard, you need to go through each service's configuration and change "/var/log" to one of the data disk mount points (i.e. /opt/dev/sdb in the example above). If you allocated more storage to the OS (and subsequently made /usr, say, 30GB and /var/log 200GB), you would not have to change as much during the Ambari install. Either approach is viable, so I would suggest discussing with your OS admin team to see if they have a preference. Also note that I'm referring to daemon logs (namenode, resource manager, etc.) that end up in /var/log, versus application logs. The yarn settings you show above are for the yarn application logs and local scratch space; you want to follow that same pattern in production.
05-21-2016
11:18 PM
3 Kudos
When installing HDB/HAWQ on the Sandbox, it is necessary to relocate the default Ambari postgres database to a postgres instance running on a different port. The following script performs the move in a mostly automated fashion. When prompted by ambari-server setup, select option 4 for the database configuration and fill in the details. Note that this is only intended for the Sandbox; please do not use it in production. #!/usr/bin/env bash
#
# Change as needed
#
PGPORT=12346
PGDATA=/var/lib/pgsql/ambari
AMBARI_WEB_USER=admin
AMBARI_WEB_PW=admin
AMBARI_DB_NAME=ambari
AMBARI_DB_USER=ambari
AMBARI_DB_PW=bigdata
#
# Variables
#
PG_INIT_PATH=/etc/init.d/postgresql
DB_BKUP_DIR=/tmp/ambari-db-backup
AMBARI_PROPS=/etc/ambari-server/conf/ambari.properties
#
# Main
#
echo -e "\n#### Stopping ambari-server"
ambari-server stop
echo -e "\n#### Creating the pgpass file"
echo "*:*:*:$AMBARI_DB_USER:$AMBARI_DB_PW" >> $HOME/.pgpass
chmod 600 $HOME/.pgpass
echo -e "\n#### Creating database backup directory"
if [ -d $DB_BKUP_DIR ]; then
rm -rf $DB_BKUP_DIR
fi
mkdir -p $DB_BKUP_DIR
chmod 777 $DB_BKUP_DIR
echo -e "\n#### Backing up ambari-server databases"
pg_dump -U $AMBARI_DB_USER -w -f $DB_BKUP_DIR/ambari.sql $AMBARI_DB_NAME
echo -e "\n#### Attempting to stop postgres on port $PGPORT, if running"
service postgresql.${PGPORT} stop
echo -e "\n#### Setting up new postgres data directory"
if [ -d $PGDATA ]; then
rm -rf $PGDATA
fi
mkdir -p $PGDATA
chown postgres:postgres $PGDATA
echo -e "\n#### Creating new init script"
sed -e 's|^PGPORT=.*|PGPORT='$PGPORT'|g' -e 's|^PGDATA=.*|PGDATA='$PGDATA'|g' $PG_INIT_PATH > ${PG_INIT_PATH}.${PGPORT}
chmod 775 ${PG_INIT_PATH}.${PGPORT}
echo -e "\n#### Initializing new postgres instance on port $PGPORT"
service postgresql.${PGPORT} initdb
echo -e "\n#### Modify postgres config to listen on all interfaces"
sed -i "s|^#\?listen_addresses.*|listen_addresses = '*'|g" $PGDATA/postgresql.conf
echo -e "\n#### Copy existing pg_hba.conf"
cp /var/lib/pgsql/data/pg_hba.conf $PGDATA/pg_hba.conf
echo -e "\n#### Starting new postgres instance on port $PGPORT"
service postgresql.${PGPORT} start
echo -e "\n#### Creating the ambari db"
su - postgres -c "psql -p $PGPORT -c 'CREATE DATABASE ambari;' -d postgres"
echo -e "\n#### Creating the ambari db user role"
su - postgres -c "psql -p $PGPORT -c \"CREATE ROLE $AMBARI_DB_USER LOGIN PASSWORD '$AMBARI_DB_PW';\" -d ambari"
echo -e "\n#### Restoring ambari database backup"
su - postgres -c "psql -p $PGPORT -f $DB_BKUP_DIR/ambari.sql -d ambari"
echo -e "\n#### Updating jdbc config for ambari-server"
grep -v "server.jdbc" $AMBARI_PROPS >${AMBARI_PROPS}.nojdbc
echo "server.jdbc.port=$PGPORT" >> ${AMBARI_PROPS}.nojdbc
echo "server.jdbc.rca.driver=org.postgresql.Driver" >> ${AMBARI_PROPS}.nojdbc
echo "server.jdbc.rca.url=jdbc:postgresql://localhost:${PGPORT}/ambari" >> ${AMBARI_PROPS}.nojdbc
echo "server.jdbc.driver=org.postgresql.Driver" >> ${AMBARI_PROPS}.nojdbc
echo "server.jdbc.user.name=$AMBARI_DB_USER" >> ${AMBARI_PROPS}.nojdbc
echo "server.jdbc.postgres.schema=ambari" >> ${AMBARI_PROPS}.nojdbc
echo "server.jdbc.hostname=localhost" >> ${AMBARI_PROPS}.nojdbc
echo "server.jdbc.rca.user.passwd=/etc/ambari-server/conf/password.dat" >> ${AMBARI_PROPS}.nojdbc
echo "server.jdbc.rca.user.name=$AMBARI_DB_USER" >> ${AMBARI_PROPS}.nojdbc
echo "server.jdbc.url=jdbc:postgresql://localhost:${PGPORT}/ambari" >> ${AMBARI_PROPS}.nojdbc
echo "server.jdbc.user.passwd=/etc/ambari-server/conf/password.dat" >> ${AMBARI_PROPS}.nojdbc
echo "server.jdbc.database=postgres" >> ${AMBARI_PROPS}.nojdbc
echo "server.jdbc.database_name=ambari" >> ${AMBARI_PROPS}.nojdbc
cp ${AMBARI_PROPS}.nojdbc $AMBARI_PROPS
echo -e "\n#### Stopping existing postgres instance"
service postgresql stop
echo -e "\n#### Running ambari-server setup"
ambari-server setup
echo -e "\n#### Starting ambari-server"
service ambari-server start
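Before trusting the relocated setup, it may help to verify that the rewritten jdbc properties landed correctly. A small sketch follows; the file path and keys match the script's defaults above, and the example invocation is only illustrative.

```shell
# Succeeds if FILE contains exactly the line KEY=EXPECTED.
check_prop() {
  grep -qx "$2=$3" "$1"
}

# Hypothetical usage against the script's defaults:
# check_prop /etc/ambari-server/conf/ambari.properties server.jdbc.port 12346 \
#   && echo "ambari points at the relocated database"
```

A direct connection check, such as `su - postgres -c "psql -p 12346 -d ambari -c '\dt'"`, confirms the restored schema is present on the new instance.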
12-03-2015
09:05 PM
1 Kudo
Thanks @Shane Kumpf, I'd been meaning to clarify this for a while; it's totally clear now. Have you seen people using DominantResourceCalculator? That one makes much more sense to me.
11-09-2015
04:01 PM
1 Kudo
I don't necessarily agree with this answer. We could avoid needing to change ownership through leveraging proxy users. I hope to find time to write a patch to demonstrate this. I'd also be interested in how many clusters are actually kerberos enabled. I expect it's lower than you think. Data ownership does matter and provides at least rudimentary controls when the user does not or can not enable Kerberos.
06-05-2018
08:01 PM
It throws the following error when I have multiple column families in my HBase table. Does this approach work only for a single column family?
java.lang.RuntimeException: Hive Runtime Error while closing operators: java.io.IOException: Multiple family directories found in hdfs://hadoopdev/apps/hive/warehouse/temp.db/employee_details/_temporary/0/_temporary/attempt_1527799542731_1180_r_000000_0
11-03-2015
11:56 PM
1 Kudo
Demo article has been added here: creating-hbase-hfiles-from-an-existing-hive-table
04-02-2016
06:25 PM
The Maven repo http://repo.hortonworks.com seems to be down; I'm getting a 500 status code.