<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: NiFI Server Configuration in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150987#M40561</link>
    <description>&lt;P&gt;If you later decide to add new disks, you can simply copy your content repositories to those new disks and update the repo config lines in the nifi.properties file to point at the new locations.&lt;/P&gt;</description>
    <pubDate>Thu, 15 Sep 2016 04:41:05 GMT</pubDate>
    <dc:creator>MattWho</dc:creator>
    <dc:date>2016-09-15T04:41:05Z</dc:date>
    <item>
      <title>NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150980#M40554</link>
      <description>&lt;P&gt;Hi, we are trying to set up a standalone NiFi server in our Hadoop environment in the cloud and are trying to determine the best configuration for it. We will have one standalone server on-site to do site-to-site with the cloud NiFi.&lt;/P&gt;&lt;P&gt;We don't have many use cases as of now and may get more in the future; based on that, we may move to a clustered environment.&lt;/P&gt;&lt;P&gt;We may have to load 2 TB of data for a future project. Keeping that in mind, I am trying to figure out suitable server specifications: number of cores, RAM, hard drives, etc.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Sai&lt;/P&gt;</description>
      <pubDate>Wed, 14 Sep 2016 03:11:00 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150980#M40554</guid>
      <dc:creator>saikrishna_tara</dc:creator>
      <dc:date>2016-09-14T03:11:00Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150981#M40555</link>
      <description>&lt;P&gt;I am also interested to know whether NiFi is a processor-heavy or memory-heavy tool.&lt;/P&gt;</description>
      <pubDate>Wed, 14 Sep 2016 09:26:58 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150981#M40555</guid>
      <dc:creator>saikrishna_tara</dc:creator>
      <dc:date>2016-09-14T09:26:58Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150982#M40556</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/11732/saikrishnatarapareddy.html" nodeid="11732"&gt;@Saikrishna Tarapareddy&lt;/A&gt;&lt;P&gt;The type and size of hardware needed for NiFi really depend on your load. NiFi stores data on disk while processing it, so you need sufficient disk capacity for your content repository, FlowFile repository, and provenance (data lineage) repository. Have you enabled archiving? (I am assuming yes.) If so, for how long do you archive your data? You need space for that as well.&lt;/P&gt;&lt;P&gt;To your question about whether NiFi is memory intensive or processor intensive, the answer is processor. Unless you are doing bulk loads, which I think you should not, you will likely want to make sure you have enough processing power. Please see the following link for performance expectations.&lt;/P&gt;&lt;P&gt;&lt;A href="http://docs.hortonworks.com/HDPDocuments/HDF1/HDF-1.2/bk_Overview/content/performance-expectations-and-characteristics-of-nifi.html" target="_blank"&gt;http://docs.hortonworks.com/HDPDocuments/HDF1/HDF-1.2/bk_Overview/content/performance-expectations-and-characteristics-of-nifi.html&lt;/A&gt;&lt;/P&gt;</description>
      <pubDate>Wed, 14 Sep 2016 12:58:49 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150982#M40556</guid>
      <dc:creator>mqureshi</dc:creator>
      <dc:date>2016-09-14T12:58:49Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150983#M40557</link>
      <description>&lt;P&gt;Hi &lt;A rel="user" href="https://community.cloudera.com/users/525/mclark.html" nodeid="525"&gt;@mclark&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I just read this article by you. It was very helpful.&lt;/P&gt;&lt;P&gt;&lt;A href="https://community.hortonworks.com/articles/7882/hdfnifi-best-practices-for-setting-up-a-high-perfo.html" target="_blank"&gt;https://community.hortonworks.com/articles/7882/hdfnifi-best-practices-for-setting-up-a-high-perfo.html&lt;/A&gt;&lt;/P&gt;&lt;P&gt;I am trying to understand the Hardware section.&lt;/P&gt;&lt;P&gt;Using a RAID 1 array, will we be able to point different repositories to different hard disk drives?&lt;/P&gt;&lt;P&gt;If so, do we need to specify the size of each disk? The one we are going to use for the content repo needs to be in TBs, whereas the FlowFile repo and prov repo don't have to be that large.&lt;/P&gt;&lt;P&gt;Does anything else apart from the repos need to be on separate drives?&lt;/P&gt;&lt;P&gt;In your breakdown here, does each / point to a different drive?&lt;/P&gt;&lt;P&gt;(1 hardware RAID 1 array)&lt;/P&gt;&lt;P&gt;RAID 1 array (This could also be a RAID 10) logical volumes:&lt;/P&gt;&lt;UL&gt;
&lt;LI&gt;-/&lt;/LI&gt;&lt;LI&gt;-/boot&lt;/LI&gt;&lt;LI&gt;-/home&lt;/LI&gt;&lt;LI&gt;-/var&lt;/LI&gt;&lt;LI&gt;-/var/log/nifi-logs &amp;lt;--&lt;EM&gt; point all your NiFi logs (logback.xml) here&lt;/EM&gt;&lt;/LI&gt;&lt;LI&gt;-/opt &amp;lt;-- &lt;EM&gt;install NiFi here under a sub-directory&lt;/EM&gt;&lt;/LI&gt;&lt;LI&gt;-/database-repo &amp;lt;-- &lt;EM&gt;point NiFi database repository here&lt;/EM&gt;&lt;/LI&gt;&lt;LI&gt;-/flowfile-repo &amp;lt;-- &lt;EM&gt;point NiFi flowfile repository here&lt;/EM&gt;&lt;/LI&gt;&lt;/UL&gt;</description>
      <pubDate>Thu, 15 Sep 2016 01:04:11 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150983#M40557</guid>
      <dc:creator>saikrishna_tara</dc:creator>
      <dc:date>2016-09-15T01:04:11Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150984#M40558</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/11732/saikrishnatarapareddy.html" nodeid="11732"&gt;@Saikrishna Tarapareddy&lt;/A&gt;&lt;P&gt;The section you are referring to is an example setup for a single server:&lt;/P&gt;&lt;P&gt;CPU: 24 - 48 cores&lt;/P&gt;&lt;P&gt;Memory: 64 -128 GB&lt;/P&gt;&lt;P&gt;Hard Drive configuration:&lt;/P&gt;&lt;P&gt;(1 hardware RAID 1 array)&lt;/P&gt;&lt;P&gt;(2 or more hardware RAID 10 arrays)&lt;/P&gt;&lt;P&gt;** What falls between each "----------------" line is on a single mounted RAID/disk. A RAID can be broken up in to multiple logical volumes if desired.  If it is, each / here represents a different logical volume.  By creating logical volumes you can control how much disk space is reserved for each which is recommended.  For example you would not want excessive logging to eat up space you want reserved for your flowfile-repo.  Logical volumes allow you to control that by splitting up that single RAID into multiple logical volumes of a defined size.&lt;/P&gt;&lt;P&gt;--------------------&lt;/P&gt;&lt;P&gt;RAID 1 array (This could also be a RAID 10) containing all the following directories/logical volumes:&lt;/P&gt;&lt;UL&gt;
&lt;LI&gt;-/&lt;/LI&gt;&lt;LI&gt;-/boot&lt;/LI&gt;&lt;LI&gt;-/home&lt;/LI&gt;&lt;LI&gt;-/var&lt;/LI&gt;&lt;LI&gt;-/var/log/nifi-logs &amp;lt;--&lt;EM&gt; point all your NiFi logs (logback.xml) here&lt;/EM&gt;&lt;/LI&gt;&lt;LI&gt;-/opt &amp;lt;-- &lt;EM&gt;install NiFi here under a sub-directory&lt;/EM&gt;&lt;/LI&gt;&lt;LI&gt;-/database-repo &amp;lt;-- &lt;EM&gt;point NiFi database repository here&lt;/EM&gt;&lt;/LI&gt;&lt;LI&gt;-/flowfile-repo &amp;lt;-- &lt;EM&gt;point NiFi flowfile repository here&lt;/EM&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;--------------------&lt;/P&gt;&lt;P&gt;1st RAID 10 array logical volumes mounted as /cont-repo1&lt;/P&gt;&lt;UL&gt;
&lt;LI&gt;-/cont-repo1 &amp;lt;-- &lt;EM&gt;point NiFi content repository here&lt;/EM&gt;&lt;/LI&gt;&lt;/UL&gt;&lt;P&gt;---------------------&lt;/P&gt;&lt;P&gt;2nd RAID 10 array logical volumes mounted as /prov-repo1&lt;/P&gt;&lt;P&gt;- /prov-repo1 &amp;lt;-- &lt;EM&gt;point NiFi provenance repository here&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;---------------------&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;3rd RAID 10 array logical volumes (recommended) mounted as /cont-repo2&lt;/P&gt;&lt;P&gt;- /cont-repo2 &amp;lt;-- &lt;EM&gt;point 2nd NiFi content repository here&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;&lt;EM&gt;----------------------&lt;/EM&gt;&lt;/P&gt;&lt;P&gt;In order to set up the above example you would need 14 hard disks:&lt;/P&gt;&lt;P&gt;(2) RAID 1&lt;/P&gt;&lt;P&gt;(4) RAID 10 (x3)  * You would only need 10 disks if you decided to have only one RAID 10 content repo array (but it would need to be 2 TB).  You could also take a large RAID 10, like the one holding prov-repo1, and split it into multiple logical volumes, giving part of that RAID's disk space to a content repo.&lt;/P&gt;&lt;P&gt;Not sure what you mean by "load 2TB of data for future project".  Are you saying you want NiFi to be able to handle a queue backlog of 2 TB of data?  If that is the case, each of your two cont-repo RAID 10s would need to be at least 1 TB in size.  &lt;/P&gt;&lt;P&gt;***While the nifi.properties file has a single line each for the content and provenance repo paths, multiple repos can be defined by adding new lines to this file as follows:

nifi.content.repository.directory.default=/cont-repo1/content_repository
&lt;/P&gt;&lt;P&gt;nifi.content.repository.directory.cont-repo2=/cont-repo2/content_repository&lt;/P&gt;&lt;P&gt;nifi.content.repository.directory.cont-repo3=/cont-repo3/content_repository&lt;/P&gt;&lt;P&gt;etc...&lt;/P&gt;&lt;P&gt;nifi.provenance.repository.directory.default=./provenance_repository&lt;/P&gt;&lt;P&gt;nifi.provenance.repository.directory.prov-repo1=/prov-repo1/provenance_repository&lt;/P&gt;&lt;P&gt;nifi.provenance.repository.directory.prov-repo2=/prov-repo2/provenance_repository&lt;/P&gt;&lt;P&gt;etc....&lt;/P&gt;&lt;P&gt;When more than one repo is defined in the nifi.properties file, NiFi will perform file-based striping across them.  This allows NiFi to spread the I/O across multiple disks, helping improve overall performance.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;</description>
      <pubDate>Thu, 15 Sep 2016 03:46:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150984#M40558</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2016-09-15T03:46:53Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150985#M40559</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/525/mclark.html" nodeid="525"&gt;@mclark&lt;/A&gt; Thank you, Matt. We are trying to set up a standalone server.&lt;/P&gt;&lt;P&gt;Let's say we can't afford RAID 10 disks. What are the better alternatives? Can we go with RAID 1 for everything? And if we decide to move to RAID 10 in the future, can we easily change the config files and migrate?&lt;/P&gt;</description>
      <pubDate>Thu, 15 Sep 2016 04:22:07 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150985#M40559</guid>
      <dc:creator>saikrishna_tara</dc:creator>
      <dc:date>2016-09-15T04:22:07Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150986#M40560</link>
      <description>&lt;P&gt;RAID 1 is fine&lt;/P&gt;</description>
      <pubDate>Thu, 15 Sep 2016 04:39:25 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150986#M40560</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2016-09-15T04:39:25Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150987#M40561</link>
      <description>&lt;P&gt;If you later decide to add new disks, you can simply copy your content repositories to those new disks and update the repo config lines in the nifi.properties file to point at the new locations.&lt;/P&gt;</description>
      <pubDate>Thu, 15 Sep 2016 04:41:05 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150987#M40561</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2016-09-15T04:41:05Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150988#M40562</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/525/mclark.html" nodeid="525"&gt;@mclark&lt;/A&gt;, thank you. Also, what happens if the server breaks down or we lose the disks because of corruption or failures? How do we get NiFi back to its previous state? Can we persist NiFi's state to a database? I read somewhere that if we keep a backup of the /conf folder, we should be in good shape.&lt;/P&gt;&lt;P&gt;Any help here is much appreciated.&lt;/P&gt;</description>
      <pubDate>Thu, 15 Sep 2016 20:53:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150988#M40562</guid>
      <dc:creator>saikrishna_tara</dc:creator>
      <dc:date>2016-09-15T20:53:40Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150989#M40563</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/11732/saikrishnatarapareddy.html" nodeid="11732"&gt;@Saikrishna Tarapareddy&lt;/A&gt;&lt;/P&gt;&lt;P&gt;The purpose of using a RAID is to protect against the loss of a disk.  If the intent here is to protect against a complete catastrophic loss of the system, there are some things you can do.&lt;/P&gt;&lt;P&gt;Keeping a backup of the conf directory will allow you to quickly restore the state of your NiFi's dataflow. Restoring the state of your dataflow does not restore any data that may have been active in the system at the time of failure.&lt;/P&gt;&lt;P&gt;The NiFi repos contain the following information:&lt;/P&gt;&lt;P&gt;Database repository --&amp;gt; Contains the change history of the graph (a record of all changes made on the canvas).  If NiFi is secured, this repo also contains the users db.  Loss of either of these has little impact. Loss of configuration history will not impact your dataflow or data.  The users db is rebuilt from the authorized-users.xml file (located in the conf dir by default) upon NiFi start.&lt;/P&gt;&lt;P&gt;Provenance repository(s) --&amp;gt;  Contains NiFi FlowFile lineage history.  Loss of this repo will not affect your dataflow or data.  You will simply be unable to perform queries against data that traversed the system prior to the loss.&lt;/P&gt;&lt;P&gt;FlowFile repository  --&amp;gt;  Loss of this repo will result in loss of data. The FlowFile repo keeps all attributes about content currently in the dataflow.  This includes where to find the actual content in the content repository(s).  The information in this repo changes rapidly, so backing up this repo is not really feasible.  RAID offers your best protection here.&lt;/P&gt;&lt;P&gt;Content repository(s)  --&amp;gt; Loss of this repo will also result in loss of data and archived data (if configured to archive).  
The content repository(s) contain the actual content of the data NiFi processes. The data in this repo also changes rapidly as files are processed through the NiFi dataflow(s), so backing up this repo(s) is also not feasible.  RAID offers your best protection here as well.&lt;/P&gt;&lt;P&gt;As you can see, recovery from disk failure is possible with RAID; however, a catastrophic loss of the entire system will result in loss of the data that was mid-processing in any of the dataflows.&lt;/P&gt;&lt;P&gt;Your repos could be on externally attached storage.  There is likely to be some performance impact because of this; however, in the event of catastrophic server loss, a new server could be stood up using the backed-up conf dir and attached to the same external storage.  This would help prevent data loss and allow processing to pick up where it left off.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;</description>
      <pubDate>Thu, 15 Sep 2016 21:28:18 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150989#M40563</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2016-09-15T21:28:18Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150990#M40564</link>
      <description>&lt;P&gt;@mclark,&lt;/P&gt;&lt;P&gt;Can you please share your thoughts?&lt;/P&gt;&lt;P&gt;Say we go with eight (8) 600 GB RAID 10 disks. I think we only get 4 disks' worth of storage out of the 8, as RAID 10 will use the other 4 for mirroring and striping. &lt;/P&gt;&lt;P&gt;Can we point 3 of them to the content repo and 1 to the provenance repo? That makes 1.8 TB for the content repo.&lt;/P&gt;&lt;P&gt;So our configuration looks like: &lt;/P&gt;&lt;P&gt;16 core, 128 GB RAM server&lt;/P&gt;&lt;P&gt;One disk 600 GB RAID 1 for OS and NiFi software (flowfile, database repo, etc.)&lt;/P&gt;&lt;P&gt;One disk 600 GB RAID 10 for Provenance repo&lt;/P&gt;&lt;P&gt;Three disks 600 GB RAID 10 for Content repo&lt;/P&gt;&lt;P&gt;My main doubt is this: since RAID 10 comes in sets of 4 disks (2 for storage and 2 for mirroring), I am sure we can use one set (1.2 TB) for the cont repo. 
Out of the remaining 2 storage disks (the second set), can I point one to the cont repo and the other to the prov repo?&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;</description>
      <pubDate>Tue, 11 Oct 2016 02:12:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150990#M40564</guid>
      <dc:creator>saikrishna_tara</dc:creator>
      <dc:date>2016-10-11T02:12:22Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150991#M40565</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/11732/saikrishnatarapareddy.html" nodeid="11732"&gt;@Saikrishna Tarapareddy&lt;/A&gt;&lt;/P&gt;&lt;P&gt;RAID 1 requires a minimum of 2 disks and RAID 10 requires a minimum of 4 disks, so with 8 disks you can build either:&lt;/P&gt;&lt;P&gt;a. (2) RAID 10&lt;/P&gt;&lt;P&gt;b. (2) RAID 1 and (1) RAID 10&lt;/P&gt;&lt;P&gt;or&lt;/P&gt;&lt;P&gt;c. (4) RAID 1&lt;/P&gt;&lt;P&gt;My recommendation for you would be to provision your (8) 600 GB disks as follows:&lt;/P&gt;&lt;P&gt;- Provision your 8 disks into (4) RAID 1 configurations (2 disks each: 600 GB + 600 GB mirrored, for a total capacity of 600 GB per array).&lt;/P&gt;&lt;P&gt;--------------&lt;/P&gt;&lt;P&gt;(1) RAID 1 (~600 GB capacity) with the following mounted logical volumes:&lt;/P&gt;&lt;P&gt;100 - 150 GB --&amp;gt; /var/log/nifi&lt;/P&gt;&lt;P&gt;100 GB --&amp;gt; /opt/nifi/flowfile_repo&lt;/P&gt;&lt;P&gt;50 GB --&amp;gt; /opt/nifi/database_repo&lt;/P&gt;&lt;P&gt;remainder --&amp;gt; /&lt;/P&gt;&lt;P&gt;(1) RAID 1 (~600 GB capacity) with the following mounted logical volumes:&lt;/P&gt;&lt;P&gt;Entire RAID as single logical volume --&amp;gt; /opt/nifi/provenance_repo&lt;/P&gt;&lt;P&gt;(1) RAID 1 (~600 GB capacity) with the following mounted logical volumes:&lt;/P&gt;&lt;P&gt;Entire RAID as single logical volume --&amp;gt; /opt/nifi/content_repo1&lt;/P&gt;&lt;P&gt;(1) RAID 1 (~600 GB capacity) with the following mounted logical volumes:&lt;/P&gt;&lt;P&gt;Entire RAID as single logical volume --&amp;gt; /opt/nifi/content_repo2&lt;/P&gt;&lt;P&gt;---------------&lt;/P&gt;&lt;P&gt;The above will give you ~1.2 TB of content_repo storage and ~600 GB of provenance history storage.&lt;/P&gt;&lt;P&gt;If provenance history is not as important to you, you could carve off another logical volume on the first RAID 1 for your provenance_repo and allocate all (3) remaining RAID 1 arrays to content repositories. 
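&lt;/P&gt;&lt;P&gt;As a rough check of the capacity figures above, here is a minimal Python sketch of the arithmetic. It assumes plain 2-disk RAID 1 mirrors and RAID 10 arrays in which half the disks hold mirror copies. The helper name raid_usable_gb and the layout labels are made up for illustration and are not part of NiFi, and real usable space will be somewhat lower once filesystem overhead is counted.&lt;/P&gt;&lt;PRE&gt;
```python
# Usable capacity for the mirrored RAID levels discussed in this thread.
# raid_usable_gb and the layout keys are hypothetical names, not NiFi settings.


def raid_usable_gb(level, disks, disk_gb=600):
    """Return usable capacity in GB for a RAID 1 or RAID 10 array."""
    if level == "RAID1" and disks == 2:
        # One disk of capacity; the second disk is the mirror copy.
        return disk_gb
    if level == "RAID10" and disks in (4, 6, 8):
        # Striped mirrors: half of the disks hold mirror copies.
        return disks // 2 * disk_gb
    raise ValueError(f"unsupported layout: {level} with {disks} disks")


# The recommended split of 8 x 600 GB disks into four RAID 1 arrays.
layout = {
    "os_logs_flowfile_database": raid_usable_gb("RAID1", 2),
    "provenance_repo": raid_usable_gb("RAID1", 2),
    "content_repo1": raid_usable_gb("RAID1", 2),
    "content_repo2": raid_usable_gb("RAID1", 2),
}

content_gb = layout["content_repo1"] + layout["content_repo2"]
provenance_gb = layout["provenance_repo"]
print("content:", content_gb, "GB; provenance:", provenance_gb, "GB")
```
&lt;/PRE&gt;&lt;P&gt;Under the same assumptions, the helper gives 1200 GB for a 4-disk RAID 10 and 1800 GB for a 6-disk RAID 10 built from 600 GB drives.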
&lt;/P&gt;&lt;P&gt;*** Note: NiFi can be configured to use multiple content repositories in the nifi.properties file:&lt;/P&gt;&lt;P&gt;nifi.content.repository.directory.&lt;STRONG&gt;default&lt;/STRONG&gt;=/opt/nifi/&lt;STRONG&gt;content_repo1&lt;/STRONG&gt;/content_repository  &amp;lt;-- This line exists already.&lt;/P&gt;&lt;P&gt;nifi.content.repository.directory.&lt;STRONG&gt;repo2&lt;/STRONG&gt;=/opt/nifi/&lt;STRONG&gt;content_repo2&lt;/STRONG&gt;/content_repository    &amp;lt;-- This line would be manually added.&lt;/P&gt;&lt;P&gt;nifi.content.repository.directory.&lt;STRONG&gt;repo3&lt;/STRONG&gt;=/opt/nifi/&lt;STRONG&gt;content_repo3&lt;/STRONG&gt;/content_repository    &amp;lt;-- This line would be manually added.&lt;/P&gt;&lt;P&gt;*** NiFi will do file-based striping across all content repos.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;</description>
      <pubDate>Tue, 11 Oct 2016 02:45:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150991#M40565</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2016-10-11T02:45:51Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150992#M40566</link>
      <description>&lt;P&gt;@mclark&lt;/P&gt;&lt;P&gt;Let's say we have 10 disks, 2 in RAID 1 and 8 in RAID 10, all 600 GB.&lt;/P&gt;&lt;P&gt;Then can I have 3 RAID 10s for the content repo,&lt;/P&gt;&lt;P&gt;1 RAID 10 for the prov repo,&lt;/P&gt;&lt;P&gt;and 1 RAID 1 for /var/log/nifi, /opt/nifi/flowfile_repo, and /opt/nifi/database_repo?&lt;/P&gt;&lt;P&gt;I am trying to see how best we can configure this with 8 RAID 10 disks and 2 RAID 1 disks. I can add more if need be.&lt;/P&gt;</description>
      <pubDate>Tue, 11 Oct 2016 04:42:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150992#M40566</guid>
      <dc:creator>saikrishna_tara</dc:creator>
      <dc:date>2016-10-11T04:42:14Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150993#M40567</link>
      <description>&lt;P&gt;@mclark&lt;/P&gt;&lt;P&gt;Thanks for your help on this so far. We are almost there.&lt;/P&gt;&lt;P&gt;Does the content repository hold all files that are processed during the time period specified in the nifi.conf setting, or will it only hold the files that failed or hit errors in processing?&lt;/P&gt;&lt;P&gt;If it holds everything, what is the use of keeping files that were processed successfully? Is there a setting I can use to delete files that were processed successfully?&lt;/P&gt;&lt;P&gt;Regards,&lt;/P&gt;&lt;P&gt;Sai&lt;/P&gt;</description>
      <pubDate>Tue, 11 Oct 2016 21:34:14 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150993#M40567</guid>
      <dc:creator>saikrishna_tara</dc:creator>
      <dc:date>2016-10-11T21:34:14Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150994#M40568</link>
      <description>&lt;A rel="user" href="https://community.cloudera.com/users/11732/saikrishnatarapareddy.html" nodeid="11732"&gt;@Saikrishna Tarapareddy&lt;/A&gt;&lt;P&gt;The retention settings in the nifi.properties file are for the NiFi data archive only.  They do not apply to files that are active (queued or still being processed) in any of your dataflows.  NiFi will allow you to continue to queue data in your dataflow all the way up to the point where your content repository disk is 100% utilized.  That is why backpressure on connections throughout your dataflow is important for controlling the number of FlowFiles that can be queued.  It is also important to isolate the content repository from the other NiFi repositories so that if it fills the disk, it does not cause corruption of those other repositories.&lt;/P&gt;&lt;P&gt;If content repository archiving is enabled&lt;/P&gt;&lt;PRE&gt;nifi.content.repository.archive.enabled=true&lt;/PRE&gt;&lt;P&gt;then the retention and usage percentage settings in the nifi.properties file take effect.  NiFi will archive FlowFiles once they are auto-terminated at the end of a dataflow. Data active in your dataflow will always take priority over archived data. If your dataflow should queue to the point that your content repository disk is full, the archive will be empty.&lt;/P&gt;&lt;P&gt;The purpose of archiving data is to allow users to replay data from any point in the dataflow, or to download and examine the content of a FlowFile after it has been processed, via the NiFi provenance UI.   For many this is a valuable feature; for others it is not so important.  If it is not important for your org to archive any data, you can simply set archive enabled to false.&lt;/P&gt;&lt;P&gt;FlowFiles that are not processed successfully within your dataflow are routed to failure relationships.  As long as you do not auto-terminate any of your failure relationships, the FlowFiles remain active/queued in your dataflow.  
You can then build a failure-handling dataflow if you like to make sure you do not lose that data.&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;</description>
      <pubDate>Tue, 11 Oct 2016 22:09:40 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150994#M40568</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2016-10-11T22:09:40Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150995#M40569</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/525/mclark.html" nodeid="525" target="_blank"&gt;@mclark&lt;/A&gt;&lt;/P&gt;&lt;P&gt;So when the FlowFile is archived after auto-termination at the end of the dataflow, is it moved within the content repository from a folder to folder\archive?&lt;/P&gt;&lt;P&gt;&lt;/P&gt;&lt;P&gt;&lt;span class="lia-inline-image-display-wrapper lia-image-align-inline" image-alt="8436-uzgck.png" style="width: 181px;"&gt;&lt;img src="https://community.cloudera.com/t5/image/serverpage/image-id/21201i84882C7B49BA84E2/image-size/medium?v=v2&amp;amp;px=400" role="button" title="8436-uzgck.png" alt="8436-uzgck.png" /&gt;&lt;/span&gt;&lt;/P&gt;</description>
      <pubDate>Sun, 18 Aug 2019 12:57:52 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150995#M40569</guid>
      <dc:creator>saikrishna_tara</dc:creator>
      <dc:date>2019-08-18T12:57:52Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150996#M40570</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/11732/saikrishnatarapareddy.html" nodeid="11732"&gt;@Saikrishna Tarapareddy&lt;/A&gt; &lt;/P&gt;&lt;P&gt;Almost...  NiFi stores FlowFile content in claims.  A claim can contain the content of one or many FlowFiles.  Claims allow NiFi to use large disks more efficiently when dealing with small content files. A claim will only be moved into the archive directory once every FlowFile associated with that claim has been auto-terminated in the dataflow(s).  Also keep in mind that you can have multiple FlowFiles pointing at the same content (this happens, for example, when you connect the same relationship multiple times from a processor). Let's say you routed a success relationship twice off of an UpdateAttribute processor. NiFi does not replicate the content, but rather creates another FlowFile that points at that same content.  So both of those FlowFiles now need to reach an auto-termination point before that content claim would be moved to archive.&lt;/P&gt;&lt;P&gt;The content claim limits are defined in the nifi.properties file:&lt;/P&gt;&lt;PRE&gt;nifi.content.claim.max.appendable.size=10 MB
nifi.content.claim.max.flow.files=100
&lt;/PRE&gt;&lt;P&gt;The above are the defaults. &lt;/P&gt;&lt;P&gt;If a file comes in at less than 10 MB in size, NiFi will try to append the next file(s) to the same claim unless the combination of those files would exceed the 10 MB max or the claim has already reached 100 files.&lt;/P&gt;&lt;P&gt;If a file comes in that is larger than 10 MB, it ends up in a claim all by itself.&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;</description>
      <pubDate>Wed, 12 Oct 2016 00:04:16 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150996#M40570</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2016-10-12T00:04:16Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150997#M40571</link>
      <description>&lt;P style="margin-left: 20px;"&gt;&lt;A rel="user" href="https://community.cloudera.com/users/525/mclark.html" nodeid="525"&gt;@mclark&lt;/A&gt; ,&lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;It looks like we have finally decided to go with the configuration below. Please let me know if you see anything alarming. I don't think 1.2 TB is needed for the FlowFile repo, but with RAID 10 that is the default. I will ask our vendor whether some of that space can be allocated to the content repo.&lt;/P&gt;&lt;P style="margin-left: 20px;"&gt;&lt;/P&gt;&lt;P&gt;HDF Server (RHEL 7), 128 GB RAM, 16 cores
2 x 600GB 15K OS (RAID1)
4 x 600GB 15K FlowFile (RAID10) *** gives us 1.2 TB
6 x 600GB 15K Content (RAID10)  *** gives us 1.8 TB
4 x 600GB 15K Provenance (RAID10) *** gives us 1.2 TB&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Sai&lt;/P&gt;</description>
      <pubDate>Wed, 12 Oct 2016 21:03:17 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150997#M40571</guid>
      <dc:creator>saikrishna_tara</dc:creator>
      <dc:date>2016-10-12T21:03:17Z</dc:date>
    </item>
    <item>
      <title>Re: NiFI Server Configuration</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150998#M40572</link>
      <description>&lt;P&gt;&lt;A rel="user" href="https://community.cloudera.com/users/11732/saikrishnatarapareddy.html" nodeid="11732"&gt;@Saikrishna Tarapareddy&lt;/A&gt;&lt;/P&gt;&lt;P&gt;The FlowFile repo will never get close to 1.2 TB in size.  That is a lot of money wasted on hardware.  You should ask your vendor about splitting that RAID into multiple logical volumes so you can allocate a large portion of it to other things.  Logical volumes are also a safe way to protect the RAID 1 where your OS lives.  If some error condition should occur that results in a lot of logging, the application logs may eat up all your disk space, affecting your OS.  With logical volumes you can protect your root disk. If that is not possible, I would recommend changing your setup to a bunch of RAID 1 arrays.&lt;/P&gt;&lt;P&gt;With the 16 x 600 GB hard drives you have allocated above, you could create 8 RAID 1 disk arrays.  &lt;/P&gt;&lt;P&gt;- 1 for root + software install + database repo + logs  (make sure you have some monitoring set up to watch disk usage on this RAID if logical volumes cannot be supported)&lt;/P&gt;&lt;P&gt;- 1 for flowfile repo&lt;/P&gt;&lt;P&gt;- 3 for content repo&lt;/P&gt;&lt;P&gt;- 3 for provenance repo &lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;&lt;P&gt;Matt&lt;/P&gt;</description>
      <pubDate>Wed, 12 Oct 2016 21:24:45 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/NiFI-Server-Configuration/m-p/150998#M40572</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2016-10-12T21:24:45Z</dc:date>
    </item>
  </channel>
</rss>

