Created 06-01-2016 10:42 AM
Following is the planned infrastructure for the production cluster.
Initially, 4 data/compute nodes, each with 2x12 cores, 256 GB RAM and 24x2TB disks (plus 2x300 GB for Linux), and 3 name/admin nodes (with far fewer disks, configured as RAID 1). Later, 4-5 datanodes will be added. All nodes will run RHEL 7.
We will be proceeding with the latest HDP 2.4 installation via Ambari.
The HDP documentation has the following statements:
The ext4 file system may have potential data loss issues with default options because of the "delayed writes" feature. XFS reportedly also has some data loss issues upon power failure. Do not use LVM; it adds latency and causes a bottleneck
I have read several existing threads and docs, but I still don't have a clear understanding of what suits the latest editions of HDP and RHEL.
ext4-vs-xfs-filesystem-survey-of-popularity
best-practices-linux-file-systems-for-hdfs
any-recommendation-on-how-to-partition-disk-space-1 (@Benjamin Leonhardi's insightful recommendation)
Following are the possibilities:
Any suggestions/recommendations/further reading (suited to the latest HDP 2.4 and RHEL 7 environment)?
Created 06-01-2016 12:10 PM
The HDP documentation around filesystem selection is outdated. ext4 and XFS are fine choices today.
You can use LVM for the OS filesystems. This provides a nice way to shuffle space around on your 2x 300GB OS drives as needed. XFS is perfectly fine here, so you can let RHEL use the default. However, note that XFS filesystems cannot be shrunk, whereas with LVM + ext4, filesystems can be expanded online and shrunk offline. This is a big gap for XFS.
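For example (the volume group and logical volume names below are just placeholders), reclaiming space from an ext4 filesystem on LVM is done with the filesystem unmounted:
  umount /opt                          # ext4 must be offline to shrink
  e2fsck -f /dev/vg_os/lv_opt          # force a check before resizing
  resize2fs /dev/vg_os/lv_opt 40G      # shrink the filesystem first
  lvreduce -L 40G /dev/vg_os/lv_opt    # then shrink the logical volume to match
  mount /opt
XFS has no equivalent shrink operation; it can only be grown.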
For the datanode disks, do not use RAID or LVM. You want each individual disk mounted as a separate filesystem. You then provide HDFS with a comma-separated list of mount points, and HDFS will handle spreading data and load across the disks. If you have 24 data disks per node, you should have 24 filesystems configured in HDFS. XFS is a good choice here, since resizing is unlikely to come into play.
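For example (the /grid/N mount points below are just placeholders for wherever you mount your 24 disks), dfs.datanode.data.dir in hdfs-site.xml would look something like:
  /grid/0/hadoop/hdfs/data,/grid/1/hadoop/hdfs/data,/grid/2/hadoop/hdfs/data,...,/grid/23/hadoop/hdfs/data
The same pattern applies to the YARN local directories (yarn.nodemanager.local-dirs): one directory per data disk.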
Also keep in mind that /var/log and /usr have specific needs. /var/log can grow to hundreds of GBs, so moving this logging to one of the data disks may be necessary. The HDP binaries are installed to /usr/hdp, and depending on which components you are installing, could use as much as 6GB per HDP release. Keep this in mind as sufficient space is needed here for upgrades.
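As a rough sketch only (the sizes are assumptions to adjust with your OS admin team), the 2x 300GB OS drives could be laid out along these lines:
  /          50 GB
  /usr       30 GB
  /var/log   150-200 GB
  swap       per your standard
  (free)     remainder left unallocated in the volume group so any of these can be grown later with LVM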
Hope that helps.
Created 06-01-2016 12:03 PM
You can use whatever filesystem you like for the O/S filesystems etc.; our recommendations are primarily targeted at the HDFS data drives.
I wouldn't use ext3 for anything anymore; ext4 and XFS have moved forward as the primary default options now.
So to try and address your options one by one:
1) No, don't do this.
2) Perfectly acceptable; take care that data drives are mounted with the recommended mount options (noatime etc.) - see the example fstab entry after this list.
3) Also perfectly acceptable. I see more people using XFS everywhere now, ext4 less so, but the deltas are relatively small; I'd go with whichever option you're more comfortable with as an organisation.
4) I wouldn't recommend that. If you're happy using XFS, use it everywhere; it just makes things easier. But see point 2) about mount options for data drives.
5) You can absolutely use LVM for your O/S partitions; just ideally don't use it for datanode and log directories.
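For example, an /etc/fstab entry for a single data disk might look like this (the device name and mount point are placeholders for however your 24 disks end up named and mounted):
  /dev/sdb1  /grid/0  xfs  defaults,noatime,nodiratime  0 0
noatime (and nodiratime, which it implies) avoids a metadata write on every read, which is the main thing people mean by the recommended mount options for data drives.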
Hope that helps!
Created 06-01-2016 01:23 PM
I have read in several threads about the care to be exercised while using ext4 (noatime etc.), but is there some concise guide or document which can be used?
Created on 06-01-2016 12:41 PM - edited 08-19-2019 04:05 AM
I suspected that the filesystem documentation was merely carried forward from previous versions; I hope Hortonworks invests some resources in updating it 🙂
The LVM part, I guess, is clear - use it for OS partitions but NOT datanode disks, am I right?
Can you help me understand more about your inputs:
So what should I proceed with - ext4 everywhere OR XFS everywhere OR both (XFS for datanodes etc. and ext4 for OS partitions, or vice versa)?
Which is the better idea: have a large, dedicated disk for the OS partitions (and add more if required, resizing using LVM) so that logs, binaries etc. have plenty of space, OR, during the HDP installation itself, redirect logs (YARN etc.) to directories on the disks dedicated to the datanode? For example, this is how it is in the test cluster:
Created 06-01-2016 06:15 PM
You are correct: use LVM for OS disks, but not data disks.
In the end, the filesystem choice doesn't make a huge difference. ext4 everywhere would simplify the overall design and allow filesystems to be grown online in the future.
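As a quick sketch (the volume group and logical volume names are placeholders), growing an ext4 filesystem that sits on LVM can be done while it stays mounted:
  lvextend -L +50G /dev/vg_os/lv_varlog   # grow the logical volume
  resize2fs /dev/vg_os/lv_varlog          # grow the mounted ext4 filesystem to fill it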
Allocating a larger amount of storage to the OS filesystems does simplify the install. Otherwise, during the Ambari install wizard, you need to go through each of the services' configurations and change "/var/log" to one of the data disk mount points (e.g. /opt/dev/sdb in the example above). If you allocated more storage to the OS (and subsequently made /usr, say, 30GB and /var/log 200GB), you would not have to change as much during the Ambari install. Either approach is viable, so I would suggest discussing with your OS admin team to see if they have a preference.
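To illustrate the kind of change I mean (exact property names vary a bit by Ambari/HDP version, and the mount point is just a placeholder), the log directory settings in the service configs look roughly like:
  hadoop-env : hdfs_log_dir_prefix = /grid/0/log/hadoop
  yarn-env   : yarn_log_dir_prefix = /grid/0/log/yarn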
Also note that I'm referring to daemon logs (namenode, resource manager, etc.) that end up in /var/log, versus application logs. The YARN settings you show above are for the YARN application logs and local scratch space. You want to follow that same pattern in production.