Member since
05-08-2022
2 Posts
0 Kudos Received
0 Solutions
05-17-2022
05:24 PM
Many thanks. I want to confirm one more thing: it seems each content-accessing processor needs to read content from disk, even when two such processors are directly connected to each other. For example, I have a ConsumeKafkaRecord_2_0 that leads to a PutElasticsearchHttpRecord; the former writes to disk, while the latter reads from disk. However, if the content could be cached in memory (and synced to disk at the same time), one disk I/O would be saved. Is there any configuration property to make content cached in memory? If there were such an option, it should improve overall throughput; otherwise, it seems better to merge all content-accessing processors into one single processor to save disk I/O. Is that correct?
05-08-2022
08:36 PM
In Apache NiFi, there are connections between processors, which act like queues of FlowFiles, and NiFi by default persists the content of FlowFiles on disk. Does this mean each such connection persists FlowFiles on disk? If that were true, each delivery of a FlowFile from one processor to another would mean one disk write and one disk read, so more processors would lead to more disk reads and writes, which in turn would lower overall throughput. Is my understanding correct? And what is the best practice to avoid it: writing everything in one processor? Thanks.
Labels:
Apache NiFi