What is the right filesystem for the Kafka disk?

We have a Hadoop cluster, version 2.6.4, with 3 Kafka machines. We are trying to decide on the best filesystem for the Kafka disk, sdb (25T). According to the article https://community.hortonworks.com/articles/80813/kafka-best-practices-1.html:

"FileSystem Selection Kafka uses regular files on disk, and such it has no hard dependency on a specific file system. We recommend EXT4 or XFS. Recent improvements to the XFS file system have shown it to have the better performance characteristics for Kafka’s workload without any compromise in stability. Note: Do not use mounted shared drives and any network file systems. In our experience Kafka is known to have index failures on such file systems. Kafka uses MemoryMapped files to store the offset index which has known issues on network file systems."

It seems they recommend XFS rather than ext4.

But XFS is an old filesystem, so I am a little confused.

I would be happy to hear more opinions on this.

My sdb (the Kafka disk):

lsblk /dev/sdb

NAME          MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sdb             8:16   0  25.5T  0 disk

Michael-Bronson
1 ACCEPTED SOLUTION

Master Mentor

@Michael Bronson

Both XFS and ext4 are recommended for running Kafka. XFS typically performs well with little tuning compared to ext4, and it has become the default filesystem for many Linux distributions.

XFS is a very high performance, scalable file system and is routinely deployed in the most demanding applications. It is the default file system in RHEL 7 and is supported on all architectures. XFS has its advantages, but in a JBOD setup it doesn't really provide a lot of benefits.

Ext4 does not scale to the same size as XFS, but it is fully supported on all architectures and will continue to see active development and support.
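
If it helps, here is a minimal sketch for confirming which filesystem a Kafka data directory is actually mounted on, and for flagging the network filesystems the quoted article warns against. It assumes a Linux host where /proc/mounts is available. The function names fs_type_for and check_kafka_fs and the mount point /kafka-logs are hypothetical, chosen only for illustration.

```shell
# Print the filesystem type for a mount point, read from a mounts table
# in /proc/mounts format: device mountpoint fstype options dump pass.
# Usage: fs_type_for <mountpoint> [mounts_file]
fs_type_for() {
    mountpoint=$1
    mounts_file=${2:-/proc/mounts}
    # If several entries share the mount point, the last one is the active mount.
    awk -v mp="$mountpoint" '$2 == mp { print $3 }' "$mounts_file" | tail -n 1
}

# Report whether a Kafka data mount uses one of the recommended filesystems.
# Usage: check_kafka_fs <mountpoint> [mounts_file]
check_kafka_fs() {
    fstype=$(fs_type_for "$1" "${2:-/proc/mounts}")
    case "$fstype" in
        xfs|ext4)  echo "OK: $1 is $fstype" ;;
        nfs*|cifs) echo "WARNING: $1 is on network filesystem $fstype" ;;
        "")        echo "UNKNOWN: $1 is not a mount point" ;;
        *)         echo "NOTE: $1 is $fstype (XFS or ext4 is recommended)" ;;
    esac
}
```

After sourcing these functions, running check_kafka_fs /kafka-logs (with your own mount point in place of the hypothetical one) prints a one-line status. The second argument exists only so the check can be tried against a saved copy of a mounts table.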

See HCC Kafka KB Article

Hope that helps!!!

7 REPLIES

Explorer

ext4 was the default filesystem in RHEL6.

XFS is the default filesystem in RHEL7 (yes, XFS has a long history, but is that a bad thing? It's still current and supported, and it scales higher than ext4).

If choosing between those two options, I would go with XFS, since it's probably more widely used now that it is the RHEL7 default.

@cmcbugg what are the benefits of XFS over ext4 for the Kafka disk?

Michael-Bronson

@Geoffrey Dear Geoffrey, can you please help me? On our latest questions we get the message "This post is currently awaiting moderation. If you believe this to be in error, contact a system administrator." What can we do to release the questions?

Michael-Bronson

Master Mentor

@Michael Bronson

That's an annoying message I once got too, but you can repost the same question with a changed header and delete the old posting... it worked for me.
Try the hack 🙂

Yes, I tried, but I got the same message even after changing the header! We also can't delete the old questions. I really don't understand what we can do; we have a crisis here.

Michael-Bronson

Who is the "system administrator" in that case?

Michael-Bronson