Member since 07-30-2019
105 Posts, 129 Kudos Received, 43 Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 1352 | 02-27-2018 01:55 PM |
 | 1756 | 02-27-2018 05:01 AM |
 | 4808 | 02-27-2018 04:43 AM |
 | 1352 | 02-27-2018 04:18 AM |
 | 4331 | 02-27-2018 03:52 AM |
07-03-2023
02:51 AM
Hi @rohan_kapadekar, I want to store the request and response of a SOAP API using JMS or Kafka. Can you help me with this or suggest a way to implement it?
02-23-2021
02:01 AM
I'm new to NiFi, and I'm not sure your data flow has the same conditions as mine, but I ran into the same exception you mentioned. I'm using Oracle 11g XE, and there was no invalid query or invalid data. In addition, I had another problem: the Oracle session used by PutSQL became locked when I sent a lot of flowfiles to the PutSQL processor, e.g., 5,000 flowfiles in 0.5 sec. I spent all day today trying to fix this, modifying almost every property of every processor connected to the flow, and even of the DBCP controller service, and finally found the cause. The PutSQL processor has a property named 'Support Fragmented Transactions'. I don't know much about it and still need to learn how it works, but when I set it to false, the problem was solved, though it took somewhat longer than before. I'm not a NiFi expert, but I hope this is helpful for you.
04-17-2018
02:01 PM
I am working on NIFI-4456, which will allow the JSON reader/writer to support the "one JSON per line" format as well as the "JSON array" format for input and output, so you will be able to read in one JSON per line and output a JSON array using ConvertRecord (or any other record-aware processor). In the meantime, you can use the following crude script in an ExecuteGroovyScript processor to process your entire file (avoiding the Split/Merge pattern); it should get you what you want:

def flowFile = session.get()
if(!flowFile) return
flowFile = session.write(flowFile, {inStream, outStream ->
    // Open the JSON array
    outStream.write('['.bytes)
    // eachLine passes a 1-based line number as the second argument
    inStream.eachLine { line, i ->
        // Separate every line after the first with a comma
        if(i > 1) outStream.write(','.bytes)
        outStream.write(line.bytes)
    }
    // Close the JSON array
    outStream.write(']'.bytes)
} as StreamCallback)
session.transfer(flowFile, REL_SUCCESS)

The script just adds array brackets around the whole doc and separates the lines with commas. I did the crude version because it doesn't need to load the entire input content into memory. If you need more control over the JSON objects, you could iterate over the lines (still with eachLine), use JsonSlurper to deserialize each string into a JSON object, add each object to an array, and then use JsonOutput to serialize the whole thing back to a string. However, that involves having the entire content in memory and could get unwieldy for large input flow files.
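For reference, the JsonSlurper/JsonOutput variant described above might look roughly like the sketch below. It is written as a standalone Groovy script rather than a NiFi script, so it can be tried outside a processor; the helper name linesToJsonArray is hypothetical, and skipping blank lines is an assumption that the original crude script does not make. Inside ExecuteGroovyScript you would presumably call something like it from within the session.write callback.

```groovy
import groovy.json.JsonOutput
import groovy.json.JsonSlurper

// Hypothetical helper: parse one JSON document per line and
// return the whole collection serialized as a JSON array string.
String linesToJsonArray(String text) {
    def slurper = new JsonSlurper()
    def objects = []
    text.eachLine { line ->
        // Assumption: blank lines are skipped rather than treated as errors
        if (line.trim()) {
            objects << slurper.parseText(line)
        }
    }
    // Every parsed object is held in memory until this point
    return JsonOutput.toJson(objects)
}

// Example usage with two one-line JSON documents
println linesToJsonArray('{"a":1}\n{"b":"x"}')
```

Because every parsed object stays in the list until serialization, memory use grows with the size of the input, which is the trade-off mentioned above.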
03-08-2018
08:49 AM
How far along is the work on surfacing provenance data in error handling? We have just discussed this option internally: rolling our own error-handling process group that uses the data provenance REST API to look up relevant data to convey in error logs and messages. But if it is on the near horizon as a built-in option, that sounds great.
11-25-2016
11:57 AM
Hi @mayki wogno, the first error message was caused by the same error as the second one. The processor reported the error twice because the ListHDFS processor logged an error message when it caught the exception and then re-threw it, after which the NiFi framework caught the exception and logged another error message. When the NiFi framework catches an exception thrown by a processor, it yields the processor for the amount of time specified by 'Yield Duration'. Once the processor successfully accesses core-site.xml and hdfs-site.xml, both error messages will be cleared.
10-11-2016
02:59 PM
Thanks for sharing your knowledge, I will try your tips. This specific GC issue happens only when I assign multiple threads to the processors to try to speed up the flow, which otherwise runs at roughly 10 MB/s in a single thread.
I originally designed the flow to use flowfile attributes because I was tempted to make the computation happen in memory. I thought that would be faster than reading the flowfile content in each processor and then parsing it to get specific fields. Do you suggest trying to implement a version that works, let's say, "on disk" on flowfile content instead of attributes?
08-02-2019
02:25 PM
@Riccardo Iacomini Thank you for the great post! This is very helpful. I am wondering how you batch things together, for example having many CSV rows in one flowfile instead of a single CSV row. If we want to batch single CSV rows into multiple rows, we use the MergeContent processor, but you also mention that MergeContent is costly. So how would batch processing work in NiFi?
09-07-2016
02:25 PM
Hi @jwitt, When I look in the nifi-assembly/target directory, I see:
archive-tmp (0 Bytes)
maven-shared-archive-resources (3.05 GB)
nifi-1.0.0-bin (1.58 GB)
.plxarc (1KB)
nifi-1.0.0-bin.tar.gz (766 MB)
nifi-1.0.0-bin.zip (766 MB)
When I tried mvn validate on the top level directory, the build came back successful. I'm sorry, I don't know what you mean by grabbing a convenience binary.
09-08-2016
12:27 AM
Thank you, I understand. It seems the document needs to be updated.
09-08-2016
03:59 AM
Yep, what you describe with UpdateAttribute/MergeContent sounds perfectly fine. What you'll want there precisely will depend on how many relationships you have out of RouteText. As for concurrent tasks, I'd say it would be:
1 for GetFile
1 for SplitFile
2...4 or 5 or so on RouteText. No need to go too high generally.
1 for MergeContent
1 to 2 for PutHDFS
You don't have to stress too much on those numbers out of the gate. You can run it with minimal threads first, find any bottlenecks and increase if necessary.