<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question NiFi: content repository errors after one of the cluster nodes fails in Support Questions</title>
    <link>https://community.cloudera.com/t5/Support-Questions/NiFi-content-repository-errors-after-one-of-the-cluster/m-p/352494#M236522</link>
    <description>&lt;P&gt;Hi everyone.&lt;/P&gt;&lt;P&gt;We have a NiFi cluster consisting of 3 nodes. After the disk subsystem failed on one of the nodes, that node stayed in a read-only state for a long time. After resolving the issue and restarting the cluster, we are getting the following error on the problematic node:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;2022-09-16 15:31:07,255 ERROR [Load-Balanced Client Thread-4] o.a.n.c.q.c.c.a.n.NioAsyncLoadBalanceClient Failed to communicate with Peer xxxxxx:9443
java.io.EOFException: Expected StandardFlowFileRecord[uuid=6ce9e262-b20b-4372-a3b9-43c2c00e8caa,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1663256223724-231072817, container=default, section=49], offset=387990, length=2203],offset=0,name=04190e1f-fdca-4352-a796-6b6c9ce41baa,size=2203] to contain 2203 bytes but the content repository only had 1130 bytes for it
	at org.apache.nifi.controller.queue.clustered.ContentRepositoryFlowFileAccess$1.ensureNotTruncated(ContentRepositoryFlowFileAccess.java:83)
	at org.apache.nifi.controller.queue.clustered.ContentRepositoryFlowFileAccess$1.read(ContentRepositoryFlowFileAccess.java:63)
	at org.apache.nifi.stream.io.StreamUtils.fillBuffer(StreamUtils.java:89)
	at org.apache.nifi.controller.queue.clustered.client.async.nio.LoadBalanceSession.getFlowFileContent(LoadBalanceSession.java:297)
	at org.apache.nifi.controller.queue.clustered.client.async.nio.LoadBalanceSession.getDataFrame(LoadBalanceSession.java:252)
	at org.apache.nifi.controller.queue.clustered.client.async.nio.LoadBalanceSession.communicate(LoadBalanceSession.java:162)
	at org.apache.nifi.controller.queue.clustered.client.async.nio.NioAsyncLoadBalanceClient.communicate(NioAsyncLoadBalanceClient.java:242)
	at org.apache.nifi.controller.queue.clustered.client.async.nio.NioAsyncLoadBalanceClientTask.run(NioAsyncLoadBalanceClientTask.java:76)
	at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)&lt;/LI-CODE&gt;&lt;P&gt;Content repository settings:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;# Content Repository
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
nifi.content.claim.max.appendable.size=1 MB
nifi.content.claim.max.flow.files=100
nifi.content.repository.directory.default=./content_repository
nifi.content.repository.archive.max.retention.period=1 hours
nifi.content.repository.archive.max.usage.percentage=75%
nifi.content.repository.archive.enabled=true
nifi.content.repository.always.sync=false
nifi.content.viewer.url=../nifi-content-viewer/
nifi.content.repository.encryption.key.provider.implementation=
nifi.content.repository.encryption.key.provider.location=
nifi.content.repository.encryption.key.id=
nifi.content.repository.encryption.key=&lt;/LI-CODE&gt;&lt;P&gt;Does anyone have any ideas on how to handle this?&lt;/P&gt;</description>
    <pubDate>Fri, 16 Sep 2022 15:35:27 GMT</pubDate>
    <dc:creator>EuGras</dc:creator>
    <dc:date>2022-09-16T15:35:27Z</dc:date>
    <item>
      <title>NiFi: content repository errors after one of the cluster nodes fails</title>
      <link>https://community.cloudera.com/t5/Support-Questions/NiFi-content-repository-errors-after-one-of-the-cluster/m-p/352494#M236522</link>
      <description>&lt;P&gt;Hi everyone.&lt;/P&gt;&lt;P&gt;We have a NiFi cluster consisting of 3 nodes. After the disk subsystem failed on one of the nodes, that node stayed in a read-only state for a long time. After resolving the issue and restarting the cluster, we are getting the following error on the problematic node:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;2022-09-16 15:31:07,255 ERROR [Load-Balanced Client Thread-4] o.a.n.c.q.c.c.a.n.NioAsyncLoadBalanceClient Failed to communicate with Peer xxxxxx:9443
java.io.EOFException: Expected StandardFlowFileRecord[uuid=6ce9e262-b20b-4372-a3b9-43c2c00e8caa,claim=StandardContentClaim [resourceClaim=StandardResourceClaim[id=1663256223724-231072817, container=default, section=49], offset=387990, length=2203],offset=0,name=04190e1f-fdca-4352-a796-6b6c9ce41baa,size=2203] to contain 2203 bytes but the content repository only had 1130 bytes for it
	at org.apache.nifi.controller.queue.clustered.ContentRepositoryFlowFileAccess$1.ensureNotTruncated(ContentRepositoryFlowFileAccess.java:83)
	at org.apache.nifi.controller.queue.clustered.ContentRepositoryFlowFileAccess$1.read(ContentRepositoryFlowFileAccess.java:63)
	at org.apache.nifi.stream.io.StreamUtils.fillBuffer(StreamUtils.java:89)
	at org.apache.nifi.controller.queue.clustered.client.async.nio.LoadBalanceSession.getFlowFileContent(LoadBalanceSession.java:297)
	at org.apache.nifi.controller.queue.clustered.client.async.nio.LoadBalanceSession.getDataFrame(LoadBalanceSession.java:252)
	at org.apache.nifi.controller.queue.clustered.client.async.nio.LoadBalanceSession.communicate(LoadBalanceSession.java:162)
	at org.apache.nifi.controller.queue.clustered.client.async.nio.NioAsyncLoadBalanceClient.communicate(NioAsyncLoadBalanceClient.java:242)
	at org.apache.nifi.controller.queue.clustered.client.async.nio.NioAsyncLoadBalanceClientTask.run(NioAsyncLoadBalanceClientTask.java:76)
	at org.apache.nifi.engine.FlowEngine$2.run(FlowEngine.java:110)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)&lt;/LI-CODE&gt;&lt;P&gt;Content repository settings:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;# Content Repository
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
nifi.content.claim.max.appendable.size=1 MB
nifi.content.claim.max.flow.files=100
nifi.content.repository.directory.default=./content_repository
nifi.content.repository.archive.max.retention.period=1 hours
nifi.content.repository.archive.max.usage.percentage=75%
nifi.content.repository.archive.enabled=true
nifi.content.repository.always.sync=false
nifi.content.viewer.url=../nifi-content-viewer/
nifi.content.repository.encryption.key.provider.implementation=
nifi.content.repository.encryption.key.provider.location=
nifi.content.repository.encryption.key.id=
nifi.content.repository.encryption.key=&lt;/LI-CODE&gt;&lt;P&gt;Does anyone have any ideas on how to handle this?&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 15:35:27 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/NiFi-content-repository-errors-after-one-of-the-cluster/m-p/352494#M236522</guid>
      <dc:creator>EuGras</dc:creator>
      <dc:date>2022-09-16T15:35:27Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi: content repository errors after one of the cluster nodes fails</title>
      <link>https://community.cloudera.com/t5/Support-Questions/NiFi-content-repository-errors-after-one-of-the-cluster/m-p/352503#M236526</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/100517"&gt;@EuGras&lt;/a&gt;&amp;nbsp;&lt;BR /&gt;&lt;BR /&gt;You have a FlowFile queued somewhere within your dataflow with UUID=&amp;nbsp;&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;6ce9e262-b20b-4372-a3b9-43c2c00e8caa&lt;/LI-CODE&gt;&lt;P&gt;To load balance data across the nodes in the cluster, the connection is trying to read the content for that FlowFile from this content claim in the content_repository:&lt;/P&gt;&lt;LI-CODE lang="markup"&gt;id=1663256223724-231072817, container=default, section=49&lt;/LI-CODE&gt;&lt;P&gt;&amp;lt;path to&amp;gt;/content_repository/49/1663256223724-231072817&lt;BR /&gt;&lt;BR /&gt;The FlowFile metadata has recorded that this content should be 2203 bytes long; however, only 1130 bytes are available for it in that file. So it appears that your disk issue resulted in data corruption.&lt;BR /&gt;&lt;BR /&gt;You could use NiFi data provenance to locate this FlowFile by UUID or filename (04190e1f-fdca-4352-a796-6b6c9ce41baa) to determine which connection contains it. On that connection, you could disable the load-balance configuration, then add a RouteOnAttribute processor that routes this one bad FlowFile away from the other FlowFiles queued in the same connection, and auto-terminate it.&lt;BR /&gt;&lt;BR /&gt;This is not to say that you do not have other corruption caused by your disk issues besides this one FlowFile. As another option, if you do not care about the data on the node that had the disk issues, you could shut down that one node and purge the contents of its flowfile_repository and content_repository. This effectively deletes all FlowFiles queued in connections on that node. Then restart the NiFi node; it will build new content and FlowFile repositories on startup.&lt;BR 
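/&gt;&lt;/P&gt;&lt;P&gt;As a rough sketch of the claim-to-path mapping described earlier in this reply, the following Python snippet pulls the FlowFile UUID, section, and claim id out of the exception message and builds the claim file path. The regular expression, the parse_truncated_claim helper, and its repo_dir default (taken from nifi.content.repository.directory.default in the question) are illustrative assumptions, not part of NiFi, and other NiFi versions may word the message differently.&lt;/P&gt;

```python
import re
from pathlib import Path

# Sample message copied from the stack trace in the question.
LOG_LINE = (
    "java.io.EOFException: Expected StandardFlowFileRecord["
    "uuid=6ce9e262-b20b-4372-a3b9-43c2c00e8caa,claim=StandardContentClaim "
    "[resourceClaim=StandardResourceClaim[id=1663256223724-231072817, "
    "container=default, section=49], offset=387990, length=2203],offset=0,"
    "name=04190e1f-fdca-4352-a796-6b6c9ce41baa,size=2203] to contain 2203 bytes "
    "but the content repository only had 1130 bytes for it"
)

# Hypothetical pattern written against the message format shown above.
PATTERN = re.compile(
    r"uuid=([0-9a-f-]+),.*?"
    r"StandardResourceClaim\[id=([^,]+), container=([^,]+), section=(\d+)\], "
    r"offset=(\d+), length=(\d+)\].*?"
    r"to contain (\d+) bytes but the content repository only had (\d+) bytes"
)


def parse_truncated_claim(line: str, repo_dir: str = "./content_repository") -> dict:
    """Extract the FlowFile UUID and the claim file path from the EOFException text."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("line does not look like the truncated-claim message")
    uuid, claim_id, container, section, offset, length, expected, actual = match.groups()
    return {
        "uuid": uuid,
        "container": container,
        # Mirrors the path layout given above: repository dir / section / claim id.
        "claim_path": Path(repo_dir) / section / claim_id,
        "offset": int(offset),
        "length": int(length),
        "missing_bytes": int(expected) - int(actual),
    }


info = parse_truncated_claim(LOG_LINE)
print(info["uuid"], info["claim_path"], info["missing_bytes"])
```

&lt;P&gt;For the message above, this yields section 49 and claim id 1663256223724-231072817, with 1073 of the expected 2203 bytes missing.&lt;/P&gt;&lt;P&gt;&lt;BR 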
/&gt;&lt;BR /&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT face="batang,apple gothic" color="#000000"&gt;Thank you,&lt;/FONT&gt;&lt;/P&gt;&lt;P&gt;&lt;FONT face="batang,apple gothic" color="#000000"&gt;Matt&lt;/FONT&gt;&lt;/P&gt;</description>
      <pubDate>Fri, 16 Sep 2022 19:49:51 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/NiFi-content-repository-errors-after-one-of-the-cluster/m-p/352503#M236526</guid>
      <dc:creator>MattWho</dc:creator>
      <dc:date>2022-09-16T19:49:51Z</dc:date>
    </item>
    <item>
      <title>Re: NiFi: content repository errors after one of the cluster nodes fails</title>
      <link>https://community.cloudera.com/t5/Support-Questions/NiFi-content-repository-errors-after-one-of-the-cluster/m-p/352702#M236579</link>
      <description>&lt;P&gt;&lt;a href="https://community.cloudera.com/t5/user/viewprofilepage/user-id/35454"&gt;@MattWho&lt;/a&gt;&amp;nbsp;thanks a lot.&amp;nbsp;&lt;SPAN&gt;I identified the connections holding the problematic FlowFiles, disabled load balancing on them, and terminated those FlowFiles using your method of filtering by ID. Interestingly, the ID of the problematic connection is not shown in nifi-app.log, but it is shown in the UI logs.&lt;/SPAN&gt;&lt;/P&gt;</description>
      <pubDate>Tue, 20 Sep 2022 09:57:01 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Support-Questions/NiFi-content-repository-errors-after-one-of-the-cluster/m-p/352702#M236579</guid>
      <dc:creator>EuGras</dc:creator>
      <dc:date>2022-09-20T09:57:01Z</dc:date>
    </item>
  </channel>
</rss>

