Member since: 11-16-2015
Posts: 905
Kudos Received: 666
Solutions: 249
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 511 | 09-30-2025 05:23 AM |
| | 838 | 06-26-2025 01:21 PM |
| | 759 | 06-19-2025 02:48 PM |
| | 935 | 05-30-2025 01:53 PM |
| | 11696 | 02-22-2024 12:38 PM |
07-12-2018 08:31 PM
There currently isn't a single processor that does the incremental fetch and the insert all in one. However, you can use QueryDatabaseTable or GenerateTableFetch to do the incremental fetch, then PutDatabaseRecord to insert the rows into the target database. If you specify a Maximum Value Column, either of the fetch processors will track the current maximum value for that column, and on each subsequent run it will only fetch rows whose column value is greater than that maximum, as in your statement above.

The biggest difference between GenerateTableFetch and QueryDatabaseTable is that GTF writes the generated SQL into flow files, which then usually get executed by an ExecuteSQL processor. This allows you to distribute the SQL statements among the nodes of a NiFi cluster so they can be executed in parallel. QueryDatabaseTable generates and executes the SQL itself each time, and is designed to run on the primary node only (if in a cluster). GTF determines the current maximum value by issuing its own MAX() query, whereas QDT keeps track of the current maximum as it iterates through and outputs the rows.
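For illustration, this is roughly the kind of SQL the incremental fetch boils down to (the table name events, the maximum-value column id, and the stored maximum 1000 are all hypothetical):

```sql
-- First run: no maximum value recorded yet, so all rows are fetched
SELECT * FROM events;

-- Later runs: only rows beyond the maximum observed on the previous run
SELECT * FROM events WHERE id > 1000;
```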
07-12-2018 01:59 PM
In this case the problem is that "time" is a reserved word in PostgreSQL, so it needs an explicit "AS" for the alias. If the column were not a reserved word, the processor should work fine (I tested with a timestamp column named "ts" and it works without the AS). I have written NIFI-5419 to cover improving alias handling in the DB fetch processors (such as GenerateTableFetch). A workaround is to use QueryDatabaseTable: you can't distribute the SQL among a cluster, but it won't issue the MAX() query that fails.
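To illustrate (with a hypothetical table name my_table), the difference is just the AS keyword:

```sql
-- Fails in PostgreSQL: an alias without AS cannot be a keyword like "time"
SELECT MAX(time) time FROM my_table;

-- Works: with an explicit AS, the keyword is accepted as an alias
SELECT MAX(time) AS time FROM my_table;
```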
07-11-2018 01:21 PM
2 Kudos
It is usually not recommended to store large values in attributes, as they are kept in memory, which can cause issues for the entire flow. Can you share an example JSON and what you're trying to get as a result? You might be able to use UpdateRecord to create the new fields in-place (i.e. in the flow file contents) rather than having to extract fields into attributes.
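For what it's worth, here is a minimal sketch of the in-place idea (the reader/writer choices and the field names /existingField and /newField are hypothetical, since I don't know your schema). UpdateRecord takes dynamic properties whose names are RecordPaths, so a property like the one below adds a new field to every record in the flow file content, with nothing held in attributes:

```
Record Reader:               JsonTreeReader
Record Writer:               JsonRecordSetWriter
Replacement Value Strategy:  Record Path Value
/newField                  = /existingField
```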
07-10-2018 02:02 AM
You can't do it recursively, but you can do it manually if you know the maximum depth of the nested fields; you just need a similar spec to match each nested field at the level where it occurs. So instead of just "*": "&", you'd also match the nested fields underneath, with a sub-spec pretty much identical to the top-level one.
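For example, assuming the goal is the colon-to-underscore rename from the spec further down this page, and a hypothetical nested field named payload, the one-level-deeper version would look something like this:

```json
[
  {
    "operation": "shift",
    "spec": {
      "payload": {
        "*:*": "payload.&(0,1)_&(0,2)",
        "*": "payload.&"
      },
      "*:*": "&(0,1)_&(0,2)",
      "*": "&"
    }
  }
]
```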
07-08-2018 02:40 AM
2 Kudos
How large are the incoming flow file contents? If they are fairly small, you could use ExtractText before ExecuteSQL to save the contents into an attribute, then apply Shu's approach (with or without the SplitRecord, depending on your use case), including another ExtractText to get the ExecuteSQL results into an attribute, followed by ReplaceText to restore the original content. Note that even if your use case can tolerate this much memory use, you'll likely want an UpdateAttribute to delete any attributes you're no longer using after these steps. PutSQL (as of NiFi 1.5.0 via NIFI-4522) has the kind of capability you're describing (i.e. retain the original content but execute SQL statements), but your use case seems a bit more esoteric with respect to what ExecuteSQL does. Is it possible that LookupAttribute would work for what you're trying to do?
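A sketch of the save-and-restore portion (the attribute name original.content is just my own choice, nothing standard):

```
ExtractText (before ExecuteSQL)
  original.content = (?s)(^.*$)   # dynamic property: regex capturing the entire content
  # Maximum Buffer Size and Maximum Capture Group Length may need raising from their defaults

ReplaceText (after ExecuteSQL)
  Replacement Strategy = Always Replace
  Evaluation Mode      = Entire text
  Replacement Value    = ${original.content}
```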
07-05-2018 02:54 PM
1 Kudo
As of NiFi 1.7.0 (via NIFI-4456) you can configure the JsonReader to read the above JSON, meaning you can use SplitRecord if you really need to split the objects up. However, depending on your use case, you may be able to use the record-aware processors and/or JoltTransformJSON to handle all the objects in one flow file.
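As an illustration, NIFI-4456 covers input along these lines (a made-up example), where multiple top-level objects appear without an enclosing array:

```json
{"id": 1, "name": "first"}
{"id": 2, "name": "second"}
{"id": 3, "name": "third"}
```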
07-03-2018 09:21 PM
Jeez, I would hope not; I'm not aware of any platform differences in Jayway (the underlying library NiFi uses for JSONPath).
07-02-2018 09:59 PM
Hmm, that's weird; this works for me:
07-02-2018 09:54 PM
1 Kudo
If they only have one colon, you could use the following transform in JoltTransformJSON to change the names to be valid in Avro:

```json
[
  {
    "operation": "shift",
    "spec": {
      "*:*": "&(0,1)_&(0,2)",
      "*": "&"
    }
  }
]
```

If you are translating from XML using a stylesheet, perhaps you could do a similar thing with the XSLT. Also, as of NiFi 1.7.0 there is an XML record reader, so you could instead use UpdateRecord to move the values from the namespaced fields into fields with Avro-valid names.
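As a quick made-up example of what the spec does, an input of

```json
{
  "ns:field": "x",
  "plain": "y"
}
```

comes out as

```json
{
  "ns_field": "x",
  "plain": "y"
}
```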
07-02-2018 07:49 PM
1 Kudo
Try the following as your JSONPath expression in EvaluateJsonPath:

```
concat($.name,"_",$.unique-id)
```

Expression Language is not supported in the JSONPath expressions, so if you need to use flow file attributes you'll need the additional processor to join the two attributes together as you described.
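As a made-up illustration, given flow file content like

```json
{
  "name": "sensor",
  "unique-id": "42"
}
```

a dynamic property on EvaluateJsonPath holding that expression (with Destination set to flowfile-attribute) should yield the attribute value sensor_42.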