NiFi - Content Repository configuration

Which would be the better content repository configuration:

One 8 TB disk array configured as RAID 1 with one mount point,

or four 2 TB disks configured as RAID 1 with four mount points?

Do mount points make a difference in how NiFi utilizes the content repository?


Thank you @james.jones. Let me see if I understand your test. Your content repository settings were:

nifi.content.repository.directory.default=/mnt1/dir1
nifi.content.repository.directory.content1=/mnt1/dir2
nifi.content.repository.directory.content2=/mnt1/dir3
nifi.content.repository.directory.content3=/mnt1/dir4

where /mnt1 is a mount point backed by multiple physical disks. In this case, your output was distributed over four directories, and NiFi would rely on the OS to write to the multiple disks in parallel.


@Shishir Saxena

In my test, I removed the default option (not sure if that is necessary), so I had:

nifi.content.repository.directory.content1=/mnt1/dir1
nifi.content.repository.directory.content2=/mnt1/dir2
nifi.content.repository.directory.content3=/mnt1/dir3
nifi.content.repository.directory.content4=/mnt1/dir4

As /mnt1 suggests, I have only one mount point and, in my case, only one physical disk. You could certainly have more than one physical disk behind it, which would be transparent to NiFi. However, I'm not sure there is any advantage to using multiple directories if you're only using one mount point. I was emulating multiple mount points as closely as I could without actually creating them.

I wanted to observe whether NiFi writes portions of a single content claim (content input) across the directories (a striping effect), writes round robin with the first claim going to content1, the second to content2, the third to content3, and so forth, or does something else. It turned out to be round robin. You may gain some performance by placing the content directories on different mount points if they are on different spindles and you're doing a lot of concurrent reads and writes to content. But as others have stated, moving the other repositories to separate disks provides a lot of benefit as well.
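
To make the round-robin behavior concrete, here is a small Python sketch. It only models what I observed and is not NiFi's actual code. The /mnt1 through /mnt4 paths stand in for four separate mount points and are hypothetical, as are the function names and the claim labels.

```python
from itertools import cycle

# Hypothetical nifi.properties fragment: four content directories,
# each assumed to sit on its own mount point.
PROPERTIES = """\
nifi.content.repository.directory.content1=/mnt1/content
nifi.content.repository.directory.content2=/mnt2/content
nifi.content.repository.directory.content3=/mnt3/content
nifi.content.repository.directory.content4=/mnt4/content
"""

PREFIX = "nifi.content.repository.directory."


def content_directories(properties_text):
    """Return {container name: directory} for each content repository entry."""
    directories = {}
    for line in properties_text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not key=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if key.startswith(PREFIX):
            directories[key[len(PREFIX):]] = value.strip()
    return directories


def assign_round_robin(claims, directories):
    """Pair each claim with the next directory in turn, wrapping around."""
    rotation = cycle(directories.items())
    return [(claim, name, path) for claim, (name, path) in zip(claims, rotation)]


if __name__ == "__main__":
    dirs = content_directories(PROPERTIES)
    claims = [f"claim-{i}" for i in range(1, 7)]  # illustrative labels only
    for claim, name, path in assign_round_robin(claims, dirs):
        print(f"{claim} -> {name} ({path})")
```

Each whole claim lands in exactly one directory, so with the directories on separate spindles, concurrent writes spread across disks. With all four directories under one mount point, as in my test, the rotation still happens but the I/O ends up on the same device.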