<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: Nifi cluster production configuration in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/Nifi-cluster-production-configuration/m-p/361190#M238529</link>
    <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/102752"&gt;@SachinMehndirat&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;There is NO replication of data from the four NiFi repositories across the NiFi nodes in a NiFi cluster.&amp;nbsp; Each NiFi node in the cluster is only aware of, and only executes against, the FlowFiles on that specific node.&lt;BR /&gt;&lt;BR /&gt;As such, NiFi nodes cannot share a common set of repositories.&amp;nbsp; Each node must have its own repositories, and it is important to protect those repositories from data loss (flowfile_repository and content_repository being the most important).&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;- &lt;STRONG&gt;flowfile_repository&lt;/STRONG&gt; - contains metadata/attributes about FlowFiles actively being processed through your NiFi dataflow(s). This includes metadata on the location of the content of queued FlowFiles.&lt;/P&gt;&lt;P&gt;- &lt;STRONG&gt;content_repository&lt;/STRONG&gt; - contains content claims, each of which can hold the content for one to many FlowFiles that are actively being processed or temporarily archived after termination at the end of the dataflow(s).&lt;BR /&gt;- &lt;STRONG&gt;provenance_repository&lt;/STRONG&gt; - contains historical lineage information about FlowFiles currently or previously processed through your NiFi dataflows.&lt;/P&gt;&lt;P&gt;- &lt;STRONG&gt;database_repository&lt;/STRONG&gt; - contains the flow configuration history, which is a record of changes made via the NiFi UI (adding, modifying, deleting, stopping, starting, etc.).&amp;nbsp; It also contains info about users currently authenticated to the NiFi node.&lt;BR /&gt;&lt;BR /&gt;Processors that record cluster-wide state use ZooKeeper to store and retrieve the state needed by all nodes.&amp;nbsp; Processors that use local state write that state to NiFi's locally configured state directory.&amp;nbsp; So in addition to protecting the repositories mentioned above from data loss, you'll also want to make sure the local state directory (unique to each node in the NiFi cluster) is protected.&lt;BR /&gt;The embedded documentation in NiFi for each component has a section "&lt;STRONG&gt;State management:&lt;/STRONG&gt;" that will tell you whether that component uses local and/or cluster state.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;You may find some of the info in the following articles useful:&lt;BR /&gt;&lt;A href="https://community.cloudera.com/t5/Community-Articles/HDF-CFM-NIFI-Best-practices-for-setting-up-a-high/ta-p/244999" target="_blank"&gt;https://community.cloudera.com/t5/Community-Articles/HDF-CFM-NIFI-Best-practices-for-setting-up-a-high/ta-p/244999&lt;/A&gt;&lt;BR /&gt;&lt;A href="https://community.cloudera.com/t5/Community-Articles/Understanding-how-NiFi-s-Content-Repository-Archiving-works/ta-p/249418" target="_blank"&gt;https://community.cloudera.com/t5/Community-Articles/Understanding-how-NiFi-s-Content-Repository-Archiving-works/ta-p/249418&lt;/A&gt;&lt;BR /&gt;&lt;A href="https://blogs.apache.org/nifi/entry/load-balancing-across-the-cluster" target="_blank"&gt;https://blogs.apache.org/nifi/entry/load-balancing-across-the-cluster&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT face="batang,apple gothic"&gt;If you found that the provided solution(s) assisted you with your query, please take a moment to log in and click&lt;/FONT&gt;&amp;nbsp;&lt;FONT face="arial black,avant garde" color="#FF0000"&gt;Accept as Solution&amp;nbsp;&lt;/FONT&gt;&lt;FONT face="batang,apple gothic" color="#000000"&gt;below each response that helped.&lt;BR /&gt;&lt;BR /&gt;Thank you,&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT face="batang,apple gothic" color="#000000"&gt;Matt&lt;/FONT&gt;&lt;/P&gt;</description>
    <pubDate>Thu, 12 Jan 2023 21:30:22 GMT</pubDate>
    <dc:creator>MattWho</dc:creator>
    <dc:date>2023-01-12T21:30:22Z</dc:date>
    <item>
      <title>Nifi cluster production configuration</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-cluster-production-configuration/m-p/361134#M238519</link>
      <description>&lt;P&gt;Folks,&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;I have set up a secured NiFi cluster in a development environment, and a few things are on my mind with respect to production use.&lt;/P&gt;&lt;P&gt;&amp;nbsp;- There are a few configurations related to the database, flow repository, content repository, provenance, components, etc., and I'm wondering what the best practices are for managing these files. Should I use persistent volumes/storage on K8s to keep these centralized for the whole cluster?&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; If yes, would it interfere with internal replication? Wouldn't it be a SPOF?&lt;/P&gt;&lt;P&gt;&amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; &amp;nbsp; If not, what happens if my cluster goes down? Will I lose all the state and data, or should I use a replica set explicitly?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Can someone help me understand the best practices for production with respect to the above scenarios, and also anything in general?&lt;/P&gt;&lt;P&gt;&amp;nbsp;&lt;/P&gt;&lt;P&gt;Thanks,&lt;/P&gt;</description>
      <pubDate>Thu, 12 Jan 2023 11:51:09 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-cluster-production-configuration/m-p/361134#M238519</guid>
      <dc:creator>SachinMehndirat</dc:creator>
      <dc:date>2023-01-12T11:51:09Z</dc:date>
    </item>
    <item>
      <title>Re: Nifi cluster production configuration</title>
      <link>https://community.cloudera.com/t5/Support-Questions/Nifi-cluster-production-configuration/m-p/361190#M238529</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/102752"&gt;@SachinMehndirat&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;There is NO replication of data from the four NiFi repositories across the NiFi nodes in a NiFi cluster.&amp;nbsp; Each NiFi node in the cluster is only aware of, and only executes against, the FlowFiles on that specific node.&lt;BR /&gt;&lt;BR /&gt;As such, NiFi nodes cannot share a common set of repositories.&amp;nbsp; Each node must have its own repositories, and it is important to protect those repositories from data loss (flowfile_repository and content_repository being the most important).&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;- &lt;STRONG&gt;flowfile_repository&lt;/STRONG&gt; - contains metadata/attributes about FlowFiles actively being processed through your NiFi dataflow(s). This includes metadata on the location of the content of queued FlowFiles.&lt;/P&gt;&lt;P&gt;- &lt;STRONG&gt;content_repository&lt;/STRONG&gt; - contains content claims, each of which can hold the content for one to many FlowFiles that are actively being processed or temporarily archived after termination at the end of the dataflow(s).&lt;BR /&gt;- &lt;STRONG&gt;provenance_repository&lt;/STRONG&gt; - contains historical lineage information about FlowFiles currently or previously processed through your NiFi dataflows.&lt;/P&gt;&lt;P&gt;- &lt;STRONG&gt;database_repository&lt;/STRONG&gt; - contains the flow configuration history, which is a record of changes made via the NiFi UI (adding, modifying, deleting, stopping, starting, etc.).&amp;nbsp; It also contains info about users currently authenticated to the NiFi node.&lt;BR /&gt;&lt;BR /&gt;Processors that record cluster-wide state use ZooKeeper to store and retrieve the state needed by all nodes.&amp;nbsp; Processors that use local state write that state to NiFi's locally configured state directory.&amp;nbsp; So in addition to protecting the repositories mentioned above from data loss, you'll also want to make sure the local state directory (unique to each node in the NiFi cluster) is protected.&lt;BR /&gt;The embedded documentation in NiFi for each component has a section "&lt;STRONG&gt;State management:&lt;/STRONG&gt;" that will tell you whether that component uses local and/or cluster state.&lt;/P&gt;&lt;P&gt;&lt;BR /&gt;You may find some of the info in the following articles useful:&lt;BR /&gt;&lt;A href="https://community.cloudera.com/t5/Community-Articles/HDF-CFM-NIFI-Best-practices-for-setting-up-a-high/ta-p/244999" target="_blank"&gt;https://community.cloudera.com/t5/Community-Articles/HDF-CFM-NIFI-Best-practices-for-setting-up-a-high/ta-p/244999&lt;/A&gt;&lt;BR /&gt;&lt;A href="https://community.cloudera.com/t5/Community-Articles/Understanding-how-NiFi-s-Content-Repository-Archiving-works/ta-p/249418" target="_blank"&gt;https://community.cloudera.com/t5/Community-Articles/Understanding-how-NiFi-s-Content-Repository-Archiving-works/ta-p/249418&lt;/A&gt;&lt;BR /&gt;&lt;A href="https://blogs.apache.org/nifi/entry/load-balancing-across-the-cluster" target="_blank"&gt;https://blogs.apache.org/nifi/entry/load-balancing-across-the-cluster&lt;/A&gt;&lt;BR /&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT face="batang,apple gothic"&gt;If you found that the provided solution(s) assisted you with your query, please take a moment to log in and click&lt;/FONT&gt;&amp;nbsp;&lt;FONT face="arial black,avant garde" color="#FF0000"&gt;Accept as Solution&amp;nbsp;&lt;/FONT&gt;&lt;FONT face="batang,apple gothic" color="#000000"&gt;below each response that helped.&lt;BR /&gt;&lt;BR /&gt;Thank you,&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT face="batang,apple gothic" color="#000000"&gt;Matt&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Thu, 12 Jan 2023 21:30:22 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/Nifi-cluster-production-configuration/m-p/361190#M238529</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2023-01-12T21:30:22Z</dc:date>
    </item>
  </channel>
</rss>

