Member since 04-03-2023
15 Posts
2 Kudos Received
2 Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 158 | 03-31-2025 08:46 AM
 | 1853 | 07-31-2023 05:34 AM
03-31-2025
08:46 AM
I doubled Xmx to 16384m, the UI came up, and the rather large backlog of files has begun to flow.
03-31-2025
08:15 AM
Looking at the log some more, I should add that the initial "error" includes "GC overhead limit exceeded". I have since learned that GC stands for garbage collection.
03-31-2025
07:30 AM
That pretty much sums it up. We had a database server go down over the weekend, and I have a suspicion that things got backed up, but I can't get to the UI to try to unravel it. I have tried setting autoResumeState to false and setting everything in flow.xml.gz to "Disabled", but the UI still won't come up because of this error. Unfortunately, all of this is in an isolated environment, so sharing full logs is not impossible but very, very difficult. This VM has 23GB of RAM, with Xms=4096m and Xmx=8192m. The version is niagarafiles-community-11.6.3.
Labels: Apache NiFi
01-30-2024
03:24 AM
1 Kudo
UPDATE: I'm working in an enclave, so this initial test was done at jolt-demo.appspot.com, but in moving it over to NiFi, I had to add one additional level to the JOLT transformation. What appears below is now correct.

That was quick. This JOLT transformation . . .
[
{
"operation": "shift",
"spec": {
"*": {
"*": {
"*": {
"*":
{
"@": ""
}
}
}
}
}
}
]
. . . transforms the JSON to this . . .
[ {
"category" : "reference",
"author" : "Nigel Rees",
"title" : "Sayings of the Century",
"price" : 8.95
}, {
"category" : "fiction",
"author" : "Herman Melville",
"title" : "Moby Dick",
"isbn" : "0-553-21311-3",
"price" : 8.99
}, {
"category" : "fiction",
"author" : "J.R.R. Tolkien",
"title" : "The Lord of the Rings",
"isbn" : "0-395-19395-8",
"price" : 22.99
} ]
And with an all-defaults JsonTreeReader and CSVRecordSetWriter, "select category" returns exactly what I need. I was thinking about JOLT, but hadn't done much with it and was fearful of the complexity. So thanks again, @SAMSAL, for pushing me in the right direction.
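For anyone who wants to see the reshaping outside NiFi, here is a rough Python sketch of the same idea: lift the objects out of their wrappers into one top-level list so that a plain "select category" has something flat to work on. This is not how JOLT is implemented, and the helper name lift_records is made up for illustration. It simply assumes the records live in a list somewhere under nested objects.

```python
from typing import Any, Dict, List


def lift_records(node: Any) -> List[Dict[str, Any]]:
    """Collect every object found inside a list, however deeply that list is wrapped."""
    if isinstance(node, list):
        # A list of objects is treated as the record set itself.
        return [item for item in node if isinstance(item, dict)]
    if isinstance(node, dict):
        # Keep descending through wrapper objects such as "store".
        found: List[Dict[str, Any]] = []
        for value in node.values():
            found.extend(lift_records(value))
        return found
    return []  # scalars carry no records


# Trimmed version of the sample document (only two fields per book).
nested = {"store": {"book": [
    {"category": "reference", "price": 8.95},
    {"category": "fiction", "price": 8.99},
    {"category": "fiction", "price": 22.99},
]}}

records = lift_records(nested)
print([record["category"] for record in records])
```

Once the records sit at the top level, picking one column is trivial, which is why the reader and writer could stay at their defaults.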
01-30-2024
02:50 AM
1 Kudo
Wow, I was excited when I saw this, as it looked like the kind of simple elegance I was looking for, and I wondered why I hadn't noticed the Starting Field Strategy property, because in trying to work this out, I had previously turned to JsonTreeReader. But in implementing it, I see why. We're on version 11.9.0, and the JsonTreeReader is version 1.14.0.i, meaning I don't have those capabilities. Because moving to a more recent version is not possible, I will go down the JOLT pathway and see what I can work out. Even though I couldn't test it out, I will accept your solution because I believe that if I had the latest and greatest, it would be the one. Plus, your description of the JSON hierarchy in play here was helpful.
01-29-2024
07:24 AM
I have this JSON: {
"store": {
"book": [
{
"category": "reference",
"author": "Nigel Rees",
"title": "Sayings of the Century",
"price": 8.95
},
{
"category": "fiction",
"author": "Herman Melville",
"title": "Moby Dick",
"isbn": "0-553-21311-3",
"price": 8.99
},
{
"category": "fiction",
"author": "J.R.R. Tolkien",
"title": "The Lord of the Rings",
"isbn": "0-395-19395-8",
"price": 22.99
}
]
}
}

I want a list/record set of categories. CSV, plain text, doesn't matter, but let's say CSV:

category
"reference"
"fiction"
"fiction"

I have tried many things, too many to repeat them all here. But basically, I have a GenerateFlowFile where the JSON is hard-coded, then a QueryRecord where the reader is a JsonPathReader in which I have properties for all fields in the JSON:

store $.store
book $.store.book[*]
category $.store.book[*].category
etc.

Just to see what's being returned, I currently have the writer set to an all-defaults JsonRecordSetWriter. With this in mind, in the QueryRecord, select * returns the JSON unaltered. select store returns the JSON unaltered. select book returns "no column named 'book'".

I can use an EvaluateJsonPath with $.store.book[*].category as the property value, and it returns this:

["reference", "fiction", "fiction"]

If I switch over to an all-defaults CSVRecordSetWriter and do select store, I get this:

store
MapRecord[{book=[Ljava.lang.Object;@24bda2f0}]

I know there are other ways to configure EvaluateJsonPath so it does parse the data correctly, but in doing so, it creates a FlowFile for each record. I don't want that; I want a single record set in one FlowFile because this is just a proof of concept. With the real data I'm looking at tens of thousands of records.

I also know I could take this to Groovy and get it done. I'd like to avoid that and use only the bare minimum of native NiFi processors. I've also tried some things with a ForkRecord, but as I said, I've kind of lost the bubble on everything I've tried. I believe this is possible, but I'm running out of energy and ideas and think I've exhausted the wisdom of the web. Is it really this difficult? Let me know what I'm doing wrong.
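To pin down the target, the result being asked for can be sketched outside NiFi in a few lines of Python. This is only an illustration of the desired output (one record set, one category column), not a NiFi configuration. The function name categories_to_csv is made up, the sample is trimmed to a few fields, and the code assumes the store → book → category shape shown above.

```python
import csv
import io
import json

# Trimmed copy of the sample document (some fields omitted).
SAMPLE = """
{"store": {"book": [
  {"category": "reference", "author": "Nigel Rees", "price": 8.95},
  {"category": "fiction", "author": "Herman Melville", "price": 8.99},
  {"category": "fiction", "author": "J.R.R. Tolkien", "price": 22.99}
]}}
"""


def categories_to_csv(document: str) -> str:
    """Return a single CSV record set with one 'category' column."""
    # Same walk as the path $.store.book[*]
    books = json.loads(document)["store"]["book"]
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["category"])              # header row
    for book in books:
        writer.writerow([book["category"]])    # one row per book, all in one output
    return out.getvalue()


print(categories_to_csv(SAMPLE))
```

The point of the sketch is that every row ends up in one output, which is the single-FlowFile behaviour wanted here rather than one FlowFile per record.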
Labels: Apache NiFi
11-03-2023
02:45 AM
Thanks, Matt! We've been pressing hard for a year now to get some migration work done from an outmoded ETL tool, and as I've moved along, there's a lot I haven't stopped to truly understand. I had seen the "variable registry only" notice before, but didn't truly appreciate what it meant. Now I do! And btw, I solved the problem by calling the "update" API directly from an InvokeHTTP processor, where there's no restriction on using attributes. Works like a charm!
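As a rough illustration of the URL that ends up being called, here is a small Python sketch of the string composition. The host, index name, and the helper solr_update_url are hypothetical, and the sketch assumes the year is appended directly to the index name with no separator, which is an assumption rather than a detail confirmed here.

```python
from urllib.parse import quote


def solr_update_url(solr_url: str, index_name: str, year: str) -> str:
    """Build <solr_url>/<index_name><year>/update (assumed layout)."""
    # Percent-encode the index name so an odd character cannot break the path.
    core = quote(f"{index_name}{year}", safe="")
    return f"{solr_url.rstrip('/')}/{core}/update"


# Hypothetical values standing in for the parameter and the attribute.
print(solr_update_url("http://localhost:8983/solr", "index_a_", "2023"))
```

In the flow itself the same string would be assembled from the parameter and the attribute in the InvokeHTTP URL property.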
11-01-2023
10:15 AM
I moved a working flow that populates Solr indexes from one process group to another. In the original, the SolrLocation property of the PutSolrContentStream processor is populated using two parameters:

#{solr_url}/#{solr_index_name_a}

It's done this way because a QueryRecord processor splits the record set into two groups; one path uses the "a" index and the other the "b" index.

In the new flow, however, I have to append a year value (e.g., "2023") to the name of the index depending on earlier processing. To accomplish this, I hold the name of the index in an attribute instead of a parameter. At the appropriate point in the flow, an UpdateAttribute processor appends the correct year to the index name. Further down, in the PutSolrContentStream processor, I populate the SolrLocation property like this:

#{solr_url}/${solr_index_name_a}

This fails with an "HTTP ERROR 404 NOT FOUND". It took a lot of trial and error, but I have discovered that if I hard-code the index name in a parameter and set SolrLocation using two parameters (as in the original flow) instead of a parameter and an attribute, like this:

#{solr_url}/#{solr_index_name}

it works. When I move back to the attribute, I get the 404 again.

In testing, I inserted an UpdateAttribute in the middle where I create an attribute called coreURL, set to the value of the parameter + attribute, and used that attribute as the SolrLocation. No dice. I then copied and pasted the value of coreURL into SolrLocation (i.e., a hard-coded URL), and it works.

It looks to me that, despite the documentation saying SolrLocation supports Expression Language, it doesn't: I've tried many variations, and any time I introduce an attribute to SolrLocation, the processor fails with a 404.

Version is 11.6.3.
Labels: Apache NiFi, Apache Solr
09-22-2023
07:47 AM
Same problem, but this did not help me. The solution for me was found on Stack Overflow: change the Result RecordPath in the LookupRecord processor to a single forward slash. https://stackoverflow.com/questions/49674048/apache-nifi-hbase-lookup
07-31-2023
05:34 AM
Thanks for the suggestions. I ended up moving everything to stored procedures in the database, which are run under the existing context (service), so no need for a sensitive parameter.