Member since: 11-16-2015
Posts: 911
Kudos Received: 668
Solutions: 249
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 697 | 09-30-2025 05:23 AM |
| | 1069 | 06-26-2025 01:21 PM |
| | 929 | 06-19-2025 02:48 PM |
| | 1094 | 05-30-2025 01:53 PM |
| | 12258 | 02-22-2024 12:38 PM |
07-08-2018
02:40 AM
2 Kudos
How large are the incoming flow file contents? If they are fairly small, you could use ExtractText before ExecuteSQL to save the contents into an attribute, then follow Shu's approach (with or without SplitRecord, depending on your use case), including ExtractText to get the ExecuteSQL results into an attribute, followed by ReplaceText to restore the original content. Note that even if your use case can tolerate this much memory usage, you'll likely want an UpdateAttribute to delete any attributes you're no longer using afterwards. PutSQL (as of NiFi 1.5.0, via NIFI-4522) has the kind of capability you're describing (retain the original content but execute SQL statements), but your use case seems a bit more esoteric with respect to what ExecuteSQL does. Is it possible that LookupAttribute would work for what you're trying to do?
07-05-2018
02:54 PM
1 Kudo
As of NiFi 1.7.0 (via NIFI-4456), you can configure the JsonReader to read the above JSON, meaning you can use SplitRecord if you really need to split the objects up. However, depending on your use case, you may be able to use the record-aware processors and/or JoltTransformJSON to handle all the objects in one flow file.
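To illustrate the splitting behavior, here is a rough Python sketch of what SplitRecord does with a top-level JSON array (that shape is an assumption, since the original JSON isn't reproduced here, and the sample records are made up):

```python
import json

# Hypothetical input: a top-level JSON array of records, the shape
# the (NiFi 1.7.0+) JsonReader can consume directly.
payload = '[{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]'

records = json.loads(payload)

# SplitRecord with a "Records Per Split" of 1 would emit one
# flow file per record, each a single-record JSON payload.
splits = [json.dumps(rec) for rec in records]
for s in splits:
    print(s)
```

With the record-aware processors you would instead leave all the objects in one flow file and operate on them in place, which avoids the flow-file-per-record overhead.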
07-03-2018
09:21 PM
Jeez, I would hope not; I'm not aware of any platform differences in Jayway (the underlying library NiFi uses for JSONPath).
07-02-2018
09:59 PM
Hmm, that's weird; this works for me:
07-02-2018
09:54 PM
1 Kudo
If they only have one colon, you could use the following transform in JoltTransformJSON to change the names to be valid in Avro:

```json
[
  {
    "operation": "shift",
    "spec": {
      "*:*": "&(0,1)_&(0,2)",
      "*": "&"
    }
  }
]
```

If you are translating from XML using a stylesheet, perhaps you could do a similar thing in the XSLT. Also, as of NiFi 1.7.0 there is an XML record reader, so you could use UpdateRecord to copy the values from the namespaced fields into fields with Avro-valid names.
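As a rough cross-check of what that shift spec does (not how NiFi runs it), the same rename can be sketched in Python: keys containing a single colon become prefix_suffix, and everything else passes through unchanged. The sample field names are invented:

```python
import json

def rename_keys(obj):
    """Replace the single colon in namespaced keys with an underscore,
    mirroring the "*:*" -> "&(0,1)_&(0,2)" Jolt shift above."""
    out = {}
    for key, value in obj.items():
        if key.count(":") == 1:
            prefix, suffix = key.split(":")
            out[prefix + "_" + suffix] = value
        else:
            out[key] = value  # the "*": "&" pass-through
    return out

doc = json.loads('{"ns:field": 1, "plain": 2}')
print(rename_keys(doc))  # {'ns_field': 1, 'plain': 2}
```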
07-02-2018
07:49 PM
1 Kudo
Try the following as your JSONPath expression in EvaluateJSONPath: concat($.name,"_",$.unique-id). Expression Language is not supported in JSONPath expressions, so if you need to use flow file attributes, you'll need an additional processor to join the two attributes together as you described.
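For reference, here is what that concat should yield, sketched in plain Python against a made-up document (Jayway's concat() behaves like simple string concatenation of its arguments):

```python
import json

# Hypothetical input document; field names match the JSONPath above.
doc = json.loads('{"name": "sensor", "unique-id": "42"}')

# Equivalent of concat($.name, "_", $.unique-id)
result = doc["name"] + "_" + doc["unique-id"]
print(result)  # sensor_42
```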
07-01-2018
04:30 AM
1 Kudo
In general, QueryDatabaseTable (QDT) must be scheduled at a rate commensurate with the granularity of the Maximum Value Column. Using a Maximum Value Column of type Date (with no timestamp) means you should only check for new values (i.e., schedule the QDT processor) once per day, not every 5 minutes. Otherwise, how would the processor know to grab rows for the current day that have come in since the last time it checked? It only looks at the date, so it would either (hypothetically) grab all rows with that Date value again, leading to duplicate data, or (as it actually behaves) look for rows with the next day's date, which it won't find until the following day, leading to data loss. Possible workarounds: schedule QDT to run once a day; add or alter a column to include a timestamp and schedule QDT accordingly; or use a different column (possibly via a DB view) that is always increasing for each new row as the Maximum Value Column. Hopefully someday there will be an Oracle CDC processor (likely leveraging LogMiner) to fetch changes in near-real-time. In the meantime, there may be a way to use QDT against Oracle/LogMiner tables to emulate this capability, if the changes are available in a table with the same qualities (i.e., a Maximum Value Column that is always increasing with respect to the Run Schedule of QDT).
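A small sketch of why a date-only Maximum Value Column loses rows when QDT runs every 5 minutes (table contents and column names are invented; the WHERE clause is a simplification of what QDT generates):

```python
from datetime import date

# Rows already in the table; the max-value column stores only a date.
rows = [{"id": 1, "d": date(2018, 7, 1)},
        {"id": 2, "d": date(2018, 7, 1)}]

# First run today: QDT issues roughly "WHERE d > :last_max".
last_max = None
fetched = [r for r in rows if last_max is None or r["d"] > last_max]
last_max = max(r["d"] for r in fetched)   # stored state: 2018-07-01

# A new row arrives later the same day...
rows.append({"id": 3, "d": date(2018, 7, 1)})

# ...but the next run 5 minutes later fetches nothing: its date is not
# strictly greater than the stored maximum, so the row is skipped.
missed = [r for r in rows if r["d"] > last_max]
print(len(missed))  # 0 -- row 3 is never picked up
```

With a timestamp column instead, each 5-minute run sees a strictly larger maximum value and picks up the intra-day rows.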
06-30-2018
04:52 AM
Record-based processors should write a "record.count" attribute to outgoing flow files, but PutElasticsearchHttpRecord does not currently. I have filed NIFI-5356 to cover this improvement. In the meantime, if you are using other record-based processors upstream, you may already have a record.count attribute. Since PutElasticsearchHttpRecord transfers the entire flow file to either success or failure, if that attribute is present and the flow file is transferred to success, then "record.count" correctly indicates the number of records loaded into Elasticsearch by that processor (for that flow file).
06-29-2018
03:06 PM
As of NiFi 1.5.0 (via NIFI-4522), you can issue a SQL statement in PutSQL while still retaining the incoming flow file contents. For your case, you could send the CSV file to PutSQL and execute a "CREATE TABLE IF NOT EXISTS" statement, which will create the table the first time but still allow the CSV to proceed to the "real" destination processor, likely PutDatabaseRecord.
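The idempotent-DDL half of this can be sketched with Python's sqlite3 module (standing in for whatever database PutSQL points at; the table name and columns are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The statement PutSQL would execute for each incoming CSV flow file:
ddl = "CREATE TABLE IF NOT EXISTS staging_csv (id INTEGER, name TEXT)"

# Running it for every flow file is safe: the first run creates the
# table, and subsequent runs are no-ops instead of errors.
conn.execute(ddl)
conn.execute(ddl)  # second flow file -- no exception raised

tables = [r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")]
print(tables)  # ['staging_csv']
```

Because the DDL never fails once the table exists, every flow file can take the same path through PutSQL on its way to PutDatabaseRecord.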
06-28-2018
06:44 PM
1 Kudo
HDF 3.1.1.0 was built with HDP Hive 2.6.4 libraries, which are not compatible with HDP Hive 2.6.2 (there were evidently some changes to the Thrift interface that were not backwards compatible). The HDF 3.0.x.y series should be compatible with HDP Hive 2.6.2; as of this writing, I think the latest is HDF 3.0.2.6.