Created on 06-27-2019 06:35 PM - edited 08-17-2019 02:47 PM
Hello,
I have a problem with the content repository, which fills up until it consumes all the available space (200 GB). I'm using the following flow to continuously ingest data from an OPC server and publish it to a Kafka server.
I changed the NiFi configuration as follows to make ingestion fast, with low latency:
# FlowFile Repository
nifi.flowfile.repository.implementation=org.apache.nifi.controller.repository.VolatileFlowFileRepository
nifi.flowfile.repository.wal.implementation=org.apache.nifi.wali.SequentialAccessWriteAheadLog
nifi.flowfile.repository.directory=./flowfile_repository
nifi.flowfile.repository.partitions=256
nifi.flowfile.repository.checkpoint.interval=2 mins
nifi.flowfile.repository.always.sync=false
nifi.swap.manager.implementation=org.apache.nifi.controller.FileSystemSwapManager
nifi.queue.swap.threshold=20000
nifi.swap.in.period=5 sec
nifi.swap.in.threads=1
nifi.swap.out.period=5 sec
nifi.swap.out.threads=4

# Content Repository
nifi.content.repository.implementation=org.apache.nifi.controller.repository.FileSystemRepository
nifi.content.claim.max.appendable.size=1 MB
nifi.content.claim.max.flow.files=10
nifi.content.repository.directory.default=./content_repository
nifi.content.repository.archive.max.retention.period=1 hours
nifi.content.repository.archive.max.usage.percentage=50%
nifi.content.repository.archive.enabled=true
nifi.content.repository.always.sync=false
nifi.content.viewer.url=../nifi-content-viewer/

# Provenance Repository Properties
nifi.provenance.repository.implementation=org.apache.nifi.provenance.VolatileProvenanceRepository
nifi.provenance.repository.debug.frequency=1_000_000
nifi.provenance.repository.encryption.key.provider.implementation=
nifi.provenance.repository.encryption.key.provider.location=
nifi.provenance.repository.encryption.key.id=
nifi.provenance.repository.encryption.key=

# Persistent Provenance Repository Properties
nifi.provenance.repository.directory.default=./provenance_repository
nifi.provenance.repository.max.storage.time=2 hours
nifi.provenance.repository.max.storage.size=1 GB
nifi.provenance.repository.rollover.time=30 secs
nifi.provenance.repository.rollover.size=100 MB
nifi.provenance.repository.query.threads=2
nifi.provenance.repository.index.threads=2
nifi.provenance.repository.compress.on.rollover=true
nifi.provenance.repository.always.sync=false
# Comma-separated list of fields. Fields that are not indexed will not be searchable. Valid fields are:
# EventType, FlowFileUUID, Filename, TransitURI, ProcessorID, AlternateIdentifierURI, Relationship, Details
#nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID, Relationship
nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID, Relationship, ContentClaimIdentifier
# FlowFile Attributes that should be indexed and made searchable. Some examples to consider are filename, uuid, mime.type
nifi.provenance.repository.indexed.attributes=
# Large values for the shard size will result in more Java heap usage when searching the Provenance Repository
# but should provide better performance
nifi.provenance.repository.index.shard.size=500 MB
# Indicates the maximum length that a FlowFile attribute can be when retrieving a Provenance Event from
# the repository. If the length of any attribute exceeds this value, it will be truncated when the event is retrieved.
nifi.provenance.repository.max.attribute.length=65536
nifi.provenance.repository.concurrent.merge.threads=2

# Volatile Provenance Respository Properties
nifi.provenance.repository.buffer.size=100000

# Component Status Repository
nifi.components.status.repository.implementation=org.apache.nifi.controller.status.history.VolatileComponentStatusRepository
nifi.components.status.repository.buffer.size=1440
nifi.components.status.snapshot.frequency=1 min
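To make the archive settings above concrete, here is a minimal Python sketch that reads the two archive properties and works out what 50% means on a 200 GB disk. It rests on two assumptions that are not confirmed in this thread: that the percentage applies to the total size of the partition holding the content repository, and that it marks the disk usage level at which archived data starts being removed (my reading of the NiFi admin guide). The helper names are made up for illustration.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def archive_threshold_bytes(props, disk_bytes):
    """Disk usage (in bytes) corresponding to the configured percentage.

    Assumption: the percentage is relative to the whole partition size.
    """
    pct = props["nifi.content.repository.archive.max.usage.percentage"]
    return disk_bytes * int(pct.rstrip("%")) // 100


# Archive-related lines copied from the configuration in this post.
SAMPLE = """
# Content Repository
nifi.content.repository.archive.max.retention.period=1 hours
nifi.content.repository.archive.max.usage.percentage=50%
nifi.content.repository.archive.enabled=true
"""

props = parse_properties(SAMPLE)
threshold = archive_threshold_bytes(props, 200 * 10**9)
print(threshold // 10**9, "GB")  # prints "100 GB"
```

If that reading is right, archived content should be eligible for removal after one hour or once usage passes roughly 100 GB, whichever comes first, so a repository that grows to the full 200 GB suggests the space is held by something other than expired archive files.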
Created 05-27-2020 06:27 AM
Hi, I have the same problem. Would setting nifi.content.repository.always.sync=true solve it?
My impression is that NiFi is out of sync with what is actually in the content repository directories: when I inspect them, I find files older than the retention period set in the archive configuration.
Is there any way to force synchronization?
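For anyone who wants to check the same thing, below is a rough Python sketch that lists archive files older than the configured retention period. It assumes the content repository is laid out as numbered section directories that each contain an `archive` subdirectory (that is what I see on disk, but treat it as an assumption), and that file modification time is a fair proxy for when the content was archived. The function names are hypothetical, not part of NiFi.

```python
import time
from pathlib import Path

# Seconds per unit, matched by prefix so "hour"/"hours", "min"/"mins" both work.
UNITS = {"sec": 1, "min": 60, "hour": 3600, "day": 86400}


def period_to_seconds(period):
    """Convert a period string such as '1 hours' or '30 secs' to seconds."""
    amount, unit = period.split()
    for prefix, factor in UNITS.items():
        if unit.lower().startswith(prefix):
            return int(amount) * factor
    raise ValueError(f"unrecognized time unit in {period!r}")


def stale_archive_files(repo_dir, retention_seconds, now=None):
    """Return archive files whose mtime is older than the retention period.

    Assumed layout: <repo_dir>/<section>/archive/<file>.
    """
    now = time.time() if now is None else now
    cutoff = now - retention_seconds
    stale = []
    for path in Path(repo_dir).glob("*/archive/*"):
        if path.is_file() and path.stat().st_mtime < cutoff:
            stale.append(path)
    return sorted(stale)


if __name__ == "__main__":
    retention = period_to_seconds("1 hours")  # value from nifi.properties
    for path in stale_archive_files("./content_repository", retention):
        print(path)
```

This only reads the directory tree and does not delete or modify anything, so it should be safe to run next to a live instance.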
Hello
I have exactly the same problem.
I just installed NiFi 1.11.4 on a Windows 10 machine, launched it with the default configuration, and created some simple flowfiles. What always happens is that the flowfiles are not archived immediately, only upon the next NiFi restart.
The flowfiles are no longer in use, and the flowfile repository is updated within two minutes of the flowfile being saved in the content repository, as expected, but a restart is needed before the file is archived.
I observed the same behavior with NiFi 1.9.2 on a Mac, and with NiFi 1.9.0 on a cluster of which I am not the administrator.
Is this NiFi's default behavior? I read the documentation and could not find anything saying that NiFi must be restarted in order to archive files in the repository.
Thanks a lot