Member since: 07-30-2019
Posts: 105
Kudos Received: 129
Solutions: 43
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 1609 | 02-27-2018 01:55 PM
 | 2034 | 02-27-2018 05:01 AM
 | 5520 | 02-27-2018 04:43 AM
 | 1792 | 02-27-2018 04:18 AM
 | 5803 | 02-27-2018 03:52 AM
10-02-2025
06:01 AM
In my opinion, the java.lang.OutOfMemoryError: Java heap space you are experiencing in NiFi may not be due to a built-in memory leak, as you suggest. It is more likely the result of the workload exceeding the allocated heap. With 123 processors and large flowfiles (50 MB × 200), memory demand can grow rapidly, and the 4 GB heap configured in bootstrap.conf may not be sufficient. NiFi is designed to handle large data flows, but it requires proper tuning. You can reduce memory pressure by increasing the heap size (if the hardware allows), adjusting processor concurrency so heavy processors don't run in parallel, and configuring back pressure to limit queued flowfiles. Additionally, efficient use of NiFi's repositories (content, flowfile, provenance) reduces reliance on heap memory. These optimisations should help your NiFi instance handle the workload more effectively and avoid frequent OutOfMemory errors. To learn about the different types of OutOfMemoryError and how to resolve them, you can refer to this blog: Types of OutOfMemoryError
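For reference, the heap is set in conf/bootstrap.conf via JVM arguments like the ones below; the java.arg. indices may differ between NiFi versions, and 8g is only an illustrative value, so size it to your hardware:
# JVM memory settings in conf/bootstrap.conf
# (java.arg. indices may differ between versions; 8g is illustrative)
java.arg.2=-Xms8g
java.arg.3=-Xmx8g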
07-03-2023
02:51 AM
Hi @rohan_kapadekar, I want to store the request and response of a SOAP API using JMS or Kafka. Can you help me with this or suggest a solution with which I can implement it?
02-23-2021
02:01 AM
I'm new to NiFi, and I'm not sure your data flow has the same conditions as mine, but I ran into the same exception you mentioned. I'm using Oracle 11g XE, and there was no invalid query or invalid data. In addition, I had another problem: the Oracle session used by PutSQL was locked when I sent a lot of flowfiles to the PutSQL processor, e.g., 5,000 flowfiles in 0.5 seconds. I spent all day today trying to fix this problem, modifying almost every property of every processor connected to the flow, and even of the DBCP controller service... and finally found the cause. The PutSQL processor has a property named 'Support Fragmented Transactions'. I don't know much about how it works yet, but when I set it to false, the problem was solved, although processing took somewhat longer than before. I'm not a NiFi expert, but I hope this might be helpful for you.
04-17-2018
02:01 PM
I am working on NIFI-4456, which will allow the JSON reader/writer to support the "one JSON per line" format as well as the "JSON array" format for input and output, so you will be able to read in one JSON object per line and output a JSON array using ConvertRecord (or any other record-aware processor). In the meantime, you can use the following crude script in an ExecuteGroovyScript processor to process your entire file (avoiding the Split/Merge pattern); it should get you what you want:
def flowFile = session.get()
if(!flowFile) return
flowFile = session.write(flowFile, {inStream, outStream ->
    outStream.write('['.bytes)
    inStream.eachLine { line, i ->
        if(i > 1) outStream.write(','.bytes)
        outStream.write(line.bytes)
    }
    outStream.write(']'.bytes)
} as StreamCallback)
session.transfer(flowFile, REL_SUCCESS)
The script just adds array brackets around the whole document and separates the lines with commas. I wrote the crude version because it doesn't need to load the entire input content into memory. If you need more control over the JSON objects, you could iterate over the lines (still with eachLine), use JsonSlurper to deserialize each string into a JSON object, add each object to an array, then use JsonOutput to serialize the whole thing back to a string; see the sketch below. However, that involves having the entire content in memory and could get unwieldy for large input flowfiles.
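For completeness, here is a minimal sketch of that in-memory variant, assuming the same ExecuteGroovyScript bindings (session, REL_SUCCESS, StreamCallback) as the script above:
def flowFile = session.get()
if(!flowFile) return
def slurper = new groovy.json.JsonSlurper()
flowFile = session.write(flowFile, {inStream, outStream ->
    def objects = []
    // deserialize each line into a JSON object and collect them
    inStream.eachLine { line ->
        objects << slurper.parseText(line)
    }
    // serialize the whole array back into a single JSON document
    outStream.write(groovy.json.JsonOutput.toJson(objects).bytes)
} as StreamCallback)
session.transfer(flowFile, REL_SUCCESS)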
03-08-2018
08:49 AM
How far along is the work on surfacing data provenance in error handling? I have just discussed this option internally: rolling our own error-handling process group that uses the data provenance REST API to look up relevant data to convey in error logs and messages. But if a built-in option is on the near horizon, that sounds great.
11-25-2016
11:57 AM
Hi @mayki wogno The first error message was caused by the same underlying error as the second one. The processor reported the error twice because it logged an error message when the ListHDFS processor caught the exception and then re-threw it, and the NiFi framework caught the exception and logged another error message, as sketched below. When the NiFi framework catches an exception thrown by a processor, it yields the processor for the amount of time specified by 'Yield Duration'. Once the processor successfully accesses core-site.xml and hdfs-site.xml, both error messages will be cleared.
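A simplified, self-contained Groovy sketch (not the actual NiFi source) of why one failure yields two log entries, with the hypothetical failure and log lines standing in for the real ones:
// the processor logs the error, then re-throws it
def processorOnTrigger = {
    try {
        throw new IOException('cannot access core-site.xml')  // hypothetical failure
    } catch (IOException e) {
        println "ERROR (processor): ${e.message}"             // first error message
        throw new RuntimeException(e)                         // re-thrown to the framework
    }
}
// the framework catches it, logs it again, and yields the processor
try {
    processorOnTrigger()
} catch (RuntimeException e) {
    println "ERROR (framework): ${e.cause?.message}"          // second error message
    // the framework would now yield the processor for 'Yield Duration'
}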
10-11-2016
02:59 PM
Thanks for sharing your knowledge, I will try your tips. This specific GC issue happens only when I assign multiple threads to the processors to try to speed up the flow, which otherwise runs at roughly 10 MB/s in a single thread.
I originally designed the flow to use flowfile attributes because I was tempted to make the computation happen in memory. I thought it would be faster than reading the flowfile content in each processor and parsing it to get specific fields. Do you suggest trying to implement a version that works, let's say, "on disk" on flowfile content instead of attributes?
08-02-2019
02:25 PM
@Riccardo Iacomini Thank you for the great post! This is very helpful. I am wondering how you batch things together, e.g., having many CSV rows in one flowfile instead of a single CSV row. If we want to batch CSV rows together, we use the MergeContent processor, but you also mention that MergeContent is costly. So how does batch processing work in NiFi?
09-07-2016
02:25 PM
Hi @jwitt, When I look in the nifi-assembly/target directory, I see: archive-tmp (0 bytes), maven-shared-archive-resources (3.05 GB), nifi-1.0.0-bin (1.58 GB), .plxarc (1 KB), nifi-1.0.0-bin.tar.gz (766 MB), and nifi-1.0.0-bin.zip (766 MB). When I ran mvn validate on the top-level directory, the build came back successful. I'm sorry, I don't know what you mean by grabbing a convenience binary.
09-08-2016
12:27 AM
Thank you, I understand. It seems the documentation needs to be updated.