Member since: 04-03-2023
Posts: 17
Kudos Received: 2
Solutions: 2

My Accepted Solutions

Title | Views | Posted
---|---|---
 | 167 | 03-31-2025 08:46 AM
 | 1864 | 07-31-2023 05:34 AM
05-13-2025
09:47 AM
This is on an isolated system, so screenshots are, well, not impossible, but very difficult.

PostHTTP (if a property is not listed, it has "No value set"):
- URL = our url/contentListener
- Max Batch Size = 100 MB
- SSL Context Service = StandardRestrictedSSLContextService
- Send as FlowFile = true
- Compression Level = 0
- Connection Timeout = 360 sec
- Data Timeout = 360 sec
- User Agent = Apache-HttpClient/4.5.5 (Java/1.8.0_262)
- Content-Type = ${mime.type}
- Disable Connection Reuse = true

ListenHTTP:
- Base Path = contentListener
- Listening Port = 9443
- SSL Context Service = StandardRestrictedSSLContextService
- HTTP Protocols = h2 http/1.1
- Client Authentication = WANT
- Authorized Subject DN Pattern = .*
- Authorized Issuer DN Pattern = .*
- Max Unconfirmed Flowfile Time = 300 secs
- HTTP Headers to receive as Attributes (Regex) = .*
- Return Code = 200
- Multipart Request Max Size = 15GB
- Multipart Read Buffer Size = 1MB
- Maximum Threadpool Size = 400
- Request Time Out = 30 secs

This particular ListenHTTP receives from both PostHTTP processors in the same instance, in other instances we own, and from many external sources. Yes, it gets hammered all day long. Under normal circumstances, there is no back pressure on the ListenHTTP processor. It only occurs when some downstream bottleneck, caused by a condition not handled gracefully, makes queues back up, but that has happened very rarely. Thanks for your input!
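For anyone reproducing the exchange outside NiFi, the client side of this setup can be sketched in plain Python. This is not NiFi's PostHTTP implementation, just an equivalent plain-HTTP request built the same way: a POST to the listener's base path whose Content-Type comes from the flowfile's mime.type attribute (the ${mime.type} expression above). The host name is hypothetical.

```python
import urllib.request

# Hypothetical listener address; the real flow uses "our url/contentListener".
LISTENER_URL = "https://nifi-host:9443/contentListener"

def build_post_request(url, payload, mime_type):
    """Build the kind of request PostHTTP sends to ListenHTTP: a POST
    whose Content-Type mirrors the flowfile's mime.type attribute and
    whose User-Agent matches the configured value above."""
    req = urllib.request.Request(url, data=payload, method="POST")
    req.add_header("Content-Type", mime_type)
    req.add_header("User-Agent", "Apache-HttpClient/4.5.5 (Java/1.8.0_262)")
    # Actually sending it would be:
    #   urllib.request.urlopen(req, timeout=360)  # matches the 360 sec timeouts
    return req
```

The request is only built, not sent, so the sketch stands on its own without the isolated system's endpoint.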
05-13-2025
04:43 AM
We have a number of PostHTTP processors sending files to ListenHTTP processors that chug along day after day, happy as a clam. Then all of a sudden, there will be a spate of these errors: "Cannot send . . . to . . . because the destination does not accept FlowFiles and this processor is configured to deliver FlowFiles; routing to failure."

For one of the PostHTTP processors, I created additional routes on success/failure so that I could look at and compare the files. These files are JSON, and after downloading files from both success and failure, both are correctly formed JSON. We have a retry-loop flow here, too, that on failure retries the post five times. These additional routes also allowed me to see in provenance that one of the files that failed with the above error eventually went through.

I'm looking for any insight on why this might be happening.

About = niagarafiles 11.31.0 based on Apache NiFi 1.26.0a
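The retry-loop flow described above is built from NiFi processors, not code, but its logic can be sketched generically. This is a rough stand-in, not the actual flow: `send` is any callable that raises on failure, and the five-attempt limit matches the post.

```python
import time

def post_with_retries(send, max_attempts=5, delay_s=1.0):
    """Sketch of the retry loop described above: on failure, retry the
    post up to five times before giving up and routing to failure."""
    last_err = None
    for attempt in range(1, max_attempts + 1):
        try:
            return send()
        except Exception as err:  # illustrative only; a real flow routes by relationship
            last_err = err
            if attempt < max_attempts:
                time.sleep(delay_s)
    raise last_err
```

This also matches the observation in the post that a file which failed once can still go through on a later attempt.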
Labels:
- Apache NiFi
03-31-2025
08:46 AM
I doubled Xmx to 16384m and the UI came up, and a rather large backlog of files has begun to flow.
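For anyone chasing the same "GC overhead limit exceeded" startup failure, the heap settings live in NiFi's conf/bootstrap.conf. The argument numbers below are the stock ones and may differ in a niagarafiles install; the values match the fix described above:

```properties
# conf/bootstrap.conf -- JVM heap settings (argument numbers vary by install)
java.arg.2=-Xms4096m
java.arg.3=-Xmx16384m
```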
03-31-2025
08:15 AM
Looking at the log some more, I should add that the initial "error" includes "GC overhead limit exceeded". I have since learned that GC = garbage collection.
03-31-2025
07:30 AM
That pretty much sums it up. We had a database server go down over the weekend, and I have a suspicion that things got backed up, but I can't get to the UI to try to unravel it. I have tried setting autoResumeState to false and setting everything in flow.xml.gz to "Disabled", but the UI still won't come up because of this error. Unfortunately, all of this is in an isolated environment, so sharing full logs is not impossible but very, very difficult.

This VM has 23 GB of RAM, with Xms=4096m and Xmx=8192m.

niagarafiles-community-11.6.3
Labels:
- Apache NiFi
01-30-2024
03:24 AM
1 Kudo
UPDATE: I'm working on an enclave, so this initial test was at jolt-demo.appspot.com, but moving it over to NiFi, I had to add one additional level in the JOLT transformation. What appears below is now correct. That was quick. This JOLT transformation . . .

[
  {
    "operation": "shift",
    "spec": {
      "*": {
        "*": {
          "*": {
            "*": {
              "@": ""
            }
          }
        }
      }
    }
  }
]

. . . transforms the JSON to this . . .

[
  {
    "category": "reference",
    "author": "Nigel Rees",
    "title": "Sayings of the Century",
    "price": 8.95
  },
  {
    "category": "fiction",
    "author": "Herman Melville",
    "title": "Moby **bleep**",
    "isbn": "0-553-21311-3",
    "price": 8.99
  },
  {
    "category": "fiction",
    "author": "J.R.R. Tolkien",
    "title": "The Lord of the Rings",
    "isbn": "0-395-19395-8",
    "price": 22.99
  }
]

And with an all-defaults JsonTreeReader and CSVRecordSetWriter, "select category" returns exactly what I need. I was thinking about JOLT, but haven't done much with it, and was fearful of the complexity. So thanks again, @SAMSAL, for pushing me in the right direction.
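For anyone without a JOLT sandbox handy, the effect of the shift spec on this input can be approximated in plain Python. This is a rough sketch of the outcome (hoist each leaf record out of the nesting into one top-level list), not a reimplementation of JOLT's wildcard matching rules.

```python
def flatten_to_records(node):
    """Rough approximation of what the JOLT shift achieves on the
    bookstore JSON: walk the nested structure and collect every leaf
    record (a dict containing only scalar values) into one flat list."""
    records = []
    if isinstance(node, dict):
        if all(not isinstance(v, (dict, list)) for v in node.values()):
            # A dict with no nested containers is a leaf record (a book).
            records.append(node)
        else:
            for value in node.values():
                records.extend(flatten_to_records(value))
    elif isinstance(node, list):
        for item in node:
            records.extend(flatten_to_records(item))
    return records
```

Feeding it the store/book JSON from the question yields the same three-object array the JOLT transformation produces.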
01-30-2024
02:50 AM
1 Kudo
Wow, I was excited when I saw this as it looked like the kind of simple elegance I was looking for, and I wondered why I hadn't noticed the Starting Field Strategy property, because in trying to work this out, I had previously turned to JsonTreeReader. But in implementing it, I see why. We're on version 11.9.0, and the JsonTreeReader is version 1.14.0.i, meaning, I don't have those capabilities. Because moving to a more recent version is not possible, I will go down the JOLT pathway and see what I can work out. Even though I couldn't test it out, I will accept your solution because I believe if I had the latest and greatest, it would be the one. Plus, your description of the JSON hierarchy in play here was helpful.
01-29-2024
07:24 AM
I have this JSON:

{
  "store": {
    "book": [
      {
        "category": "reference",
        "author": "Nigel Rees",
        "title": "Sayings of the Century",
        "price": 8.95
      },
      {
        "category": "fiction",
        "author": "Herman Melville",
        "title": "Moby **bleep**",
        "isbn": "0-553-21311-3",
        "price": 8.99
      },
      {
        "category": "fiction",
        "author": "J.R.R. Tolkien",
        "title": "The Lord of the Rings",
        "isbn": "0-395-19395-8",
        "price": 22.99
      }
    ]
  }
}

I want a list/record set of categories. CSV, plain text, doesn't matter, but let's say CSV:

category
"reference"
"fiction"
"fiction"

I have tried many things, too many to repeat them all here. But basically, I have a GenerateFlowFile where the JSON is hard-coded, then a QueryRecord where the reader is a JsonPathReader in which I have properties for all fields in the JSON:

store $.store
book $.store.book[*]
category $.store.book[*].category

etc.

Just to see what's being returned, I currently have the writer set to an all-defaults JsonRecordSetWriter. With this in mind, in the QueryRecord, "select *" returns the JSON unaltered. "select store" returns the JSON unaltered. "select book" returns "no column named 'book'". I can use an EvaluateJsonPath with $.store.book[*].category as the property value, and it returns this: ["reference", "fiction", "fiction"]

If I switch over to an all-defaults CSVRecordSetWriter and do "select store", I get this:

store
MapRecord[{book=[Ljava.lang.Object;@24bda2f0}]

I know there are other ways to configure EvaluateJsonPath so it does parse the data correctly, but in doing so, it creates a FlowFile for each record. I don't want that; I want a single record set in one FlowFile, because this is just a proof of concept. With the real data, I'm looking at tens of thousands of records. I also know I could take this to Groovy and get it done. I'd like to avoid that and use only the bare minimum of native NiFi processors. I've also tried some things with a ForkRecord, but as I said, I've kind of lost the bubble on everything I've tried. I believe this is possible, but I'm running out of energy and ideas and think I've exhausted the wisdom of the web. Is it really this difficult? Let me know what I'm doing wrong.
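To pin down exactly what output is being asked for here, the end result is small enough to sketch in plain Python. This is not a NiFi substitute, just the $.store.book[*].category extraction emitted as a single CSV record set (header plus one row per book) in one output:

```python
import csv
import io
import json

def categories_to_csv(json_text):
    """Extract $.store.book[*].category and emit one CSV record set,
    mirroring what the QueryRecord + CSVRecordSetWriter combination is
    being asked to produce: one flowfile, one header, one row per book."""
    data = json.loads(json_text)
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["category"])
    for book in data["store"]["book"]:
        writer.writerow([book["category"]])
    return out.getvalue()
```

The key point is that everything lands in a single output, not one output per record as the EvaluateJsonPath split would give.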
Labels:
- Apache NiFi
11-03-2023
02:45 AM
Thanks, Matt! We've been pressing hard for a year now to get some migration work done from an outmoded ETL tool, and as I've moved along, there's a lot I haven't stopped to truly understand. I had seen the "variable registry only" notice before, but didn't truly appreciate what that meant. Now I do! And btw, I solved the problem by calling the "update" API directly from an InvokeHTTP processor, where there's no restriction on using attributes. Works like a charm!
11-01-2023
10:15 AM
I moved a working flow that populates Solr indexes from one process group to another. In the original, the SolrLocation property of the PutSolrContentStream processor is populated using two parameters: #{solr_url}/#{solr_index_name_a}. It's done this way because a QueryRecord processor is used to split the record set into two groups; one path uses the "a" index, and the other, the "b" index.

However, in the new flow, I have to append a year value (i.e., "2023") to the name of the index depending on earlier processing. To accomplish this, I am holding the name of the index in an attribute instead of a parameter. At the appropriate time in the flow, I use an UpdateAttribute processor to append the correct year to the index name. Then further down, I have the PutSolrContentStream processor, and I populate the SolrLocation property like this: #{solr_url}/${solr_index_name_a}. This fails with an "HTTP ERROR 404 NOT FOUND".

It took a lot of trial and error, but I have discovered that if I hard-code the index name in a parameter and set the SolrLocation using two parameters (as in the original flow) instead of a parameter and an attribute, like this: #{solr_url}/#{solr_index_name}, it works. I move back to the attribute, and I get 404. In testing, I inserted an UpdateAttribute in the middle, where I create an attribute called coreURL, set it to the value of the parameter + attribute, and use that attribute instead as the SolrLocation. No dice. I then copy and paste the value of coreURL into SolrLocation (i.e., a hard-coded URL), and it works.

It looks to me that, despite the documentation saying SolrLocation supports Expression Language, it doesn't, because I've tried many variations, and any time I introduce an attribute to SolrLocation, the processor fails with a 404.

Version is 11.6.3.
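For reference, the URL the flow is trying to assemble can be written out in plain Python. The names mirror the parameter and attribute from the post, but the separator between index name and year is an assumption (the post only says the year is appended), so treat the underscore as hypothetical:

```python
def build_solr_location(solr_url, index_name, year):
    """Compose the intended SolrLocation: the #{solr_url} parameter,
    a slash, and the index-name attribute with the year appended by the
    upstream UpdateAttribute. The "_" separator is a hypothetical choice."""
    return f"{solr_url}/{index_name}_{year}"
```

Whatever finally works in NiFi should produce exactly this string; comparing it character-for-character against the hard-coded URL that succeeds is a good way to rule out a stray space or case difference in the attribute value.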
Labels:
- Apache NiFi
- Apache Solr