Member since: 11-16-2015
Posts: 905
Kudos Received: 665
Solutions: 249
My Accepted Solutions
| Views | Posted |
|---|---|
| 442 | 09-30-2025 05:23 AM |
| 777 | 06-26-2025 01:21 PM |
| 675 | 06-19-2025 02:48 PM |
| 863 | 05-30-2025 01:53 PM |
| 11441 | 02-22-2024 12:38 PM |
03-07-2017 08:29 PM
2 Kudos
NIFI-3413 describes a GetChangeDataCapture processor, which will generate events that correspond to database updates (e.g., rows added, tables changed). The hope is that it will support MySQL, Oracle, and SQL Server (although the first release of the processor may not cover all of them). Development of the MySQL CDC support (via its binlog files) is underway.
03-07-2017 04:19 PM
You can use GenerateFlowFile to create 0-byte flow files. Recent versions of NiFi also let you supply dynamic properties to GenerateFlowFile, which means you can add a "filename" attribute set to "${filename}.sem"; the ${filename} part resolves to the auto-generated filename, which is a timestamp/number. If your version of NiFi doesn't support dynamic properties in GenerateFlowFile, you can use UpdateAttribute afterward to do the same. Note that GenerateFlowFile runs continuously on the schedule you pick, so there isn't a concept of being "done" or creating a "batch". Instead, you could manually (or via the REST API) stop the processor once you have the requisite number of flow files. To write the files somewhere, use the PutFile processor.
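If you would rather control the name yourself, a property value like ${now():toNumber()}.sem is one option (assuming a millisecond-epoch timestamp is an acceptable name); now() returns the current date and toNumber() converts it to epoch milliseconds.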
03-07-2017 02:51 PM
ReplaceText uses regular expressions, so if you can write a regex that will successfully match the field names you want, you should be able to use that. I usually avoid that approach for structured text like JSON, especially if I don't know what the input will look like. But if you know the format/structure, you may find that ReplaceText works fine for your needs.
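As an illustration (the field names here are hypothetical): to rename a key "oldName" to "newName", you could set ReplaceText's Replacement Strategy to "Regex Replace", its Search Value to "oldName"\s*: and its Replacement Value to "newName":. Just be aware that a regex like this will also match the same text if it appears inside a string value, which is one reason a structure-aware approach is often safer.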
03-07-2017 02:42 PM
1 Kudo
To put each of the ServiceCodes values into its own "row", you can use JoltTransformJSON with the following shift specification:

{
  "operation": "shift",
  "spec": {
    "ServiceCodes": {
      "*": {
        "@(2,Time)": "[&].Time",
        "@(2,Subscription)": "[&].Subscription",
        "@": "[&].ServiceCode"
      }
    }
  }
}

Given your input above, it will produce the following:

[ {
  "ServiceCode" : "SERVICE_CODE1",
  "Subscription" : "1234567",
  "Time" : "03/07/2017 11:45:46.365"
}, {
  "ServiceCode" : "SERVICE_CODE2",
  "Subscription" : "1234567",
  "Time" : "03/07/2017 11:45:46.365"
}, {
  "ServiceCode" : "SERVICE_CODE3",
  "Subscription" : "1234567",
  "Time" : "03/07/2017 11:45:46.365"
}, {
  "ServiceCode" : "SERVICE_CODE4",
  "Subscription" : "1234567",
  "Time" : "03/07/2017 11:45:46.365"
} ]

This might be able to go directly into ConvertJSONToSQL, but if it doesn't, you can use SplitJson with $[*] or $.* as the JSON Path expression, and it will divide the array into one flow file per object. Then you should be able to transform each one to SQL.
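One thing to keep in mind if you do reach for ConvertJSONToSQL: it expects each flow file to contain a flat JSON object whose field names match the target table's column names, which is exactly what the per-object output of SplitJson gives you (assuming your table has ServiceCode, Subscription, and Time columns).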
03-07-2017 02:06 PM
Your input is an array but that specification works on a single JSON object. Try one of those objects at a time. And when you run your flow, make sure you have the SplitJson before the JoltTransformJSON processor, to divide up the array into individual flow files, each containing a single JSON object. As I said above, if you need to process the entire array at once, you will need a different specification, and I couldn't create one that worked.
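For instance, a spec whose matchers expect a top-level object such as {"X": "value"} will not match an incoming array like [{"X": "value"}, {"X": "value"}], because the top level of the input tree is then an array index rather than a field name.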
03-07-2017 03:54 AM
2 Kudos
If your configuration of ExecuteStreamCommand outputs an integer and you would like it in an attribute, try setting the "Output Destination Attribute" property of ExecuteStreamCommand to the attribute name you'd like, and use the "original" relationship to transfer the flow file downstream. That will give you a flow file with the same incoming content, plus an attribute whose name is of your choosing and whose value is the output stream returned by the command you are executing (hopefully the same value you mention your command returns). If instead you want the exit code of the command, you will find it in the "execution.status" attribute of the outgoing flow file (see the ExecuteStreamCommand documentation).
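As a sketch (the attribute name "command.output" is just an example): set Output Destination Attribute to command.output, route the "original" relationship onward, and reference the value downstream with ${command.output}, for example in a RouteOnAttribute property such as ${command.output:gt(0)}.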
03-06-2017 04:22 PM
1 Kudo
All Property Descriptors (required or optional) must have a Validator set explicitly; otherwise NiFi will return the error you are seeing. It appears you are not looking to perform validation, but you still must set a validator, so on your optional properties add the following to the builder: .addValidator(Validator.VALID)
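A minimal sketch of such a descriptor (the property name and description are hypothetical):

import org.apache.nifi.components.PropertyDescriptor;
import org.apache.nifi.components.Validator;

// An optional property that performs no validation.
// Validator.VALID accepts any input, but it still satisfies the
// framework's requirement that a validator be set explicitly.
public static final PropertyDescriptor MY_OPTIONAL_PROPERTY = new PropertyDescriptor.Builder()
        .name("My Optional Property")
        .description("An example optional property.")
        .required(false)
        .addValidator(Validator.VALID)
        .build();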
03-06-2017 03:42 PM
What does your table look like? Is there a column that is guaranteed to be "strictly increasing" for each added/updated row? Sometimes this is the ID column (if using an autoincrementing integer that doesn't roll over), or perhaps a timestamp column such as "Last Updated". If you have no such column, then you will want to follow Bryan's advice on scheduling and start/stop.
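If such a column does exist, and assuming you are using QueryDatabaseTable here, you can name it in the processor's Maximum-value Columns property; the processor will then remember the largest value it has seen and fetch only rows beyond it on each run.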
03-06-2017 02:55 PM
2 Kudos
Are you trying to retain the structure of the JSON array and/or objects, and just rename the fields? If so, try the JoltTransformJSON processor. I am guessing you will eventually need to split the JSON array into individual objects in order to insert them into your database? If so, then try SplitJson first (with a JSON Path expression of $[*] or $.*) to get each JSON object into its own flow file. Then you can use JoltTransformJSON with the following Shift specification:

{
  "operation": "shift",
  "spec": {
    "X": "A",
    "Y": "B",
    "Z": "C",
    "W": "D"
  }
}

That should map your fields the way you have described (X=A, Y=B, Z=C, W=D). It may be possible to write a Shift or Chain specification that would handle the mapping for your entire original array, but I wasn't able to get a working spec (using this demo app); perhaps someone else (@Yolanda M. Davis ?) can chime in on that.
03-06-2017 02:26 PM
This question has a "nifi-processor" tag; which NiFi processor are you using? Also, which processor(s) are you using to get the email messages? I suspect you should be able to use RouteOnAttribute or RouteOnContent to send emails with ZIP attachments to some other relationship, while those without attachments go directly to PutSolrContentStream (or whatever you're using to push data to Solr). Perhaps the branch with ZIP attachments can use processor(s) to remove the ZIP part of the attachment, retain the email message, and route back to the "main" branch to retry the "put".
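As a hypothetical sketch: if an upstream processor such as ExtractEmailAttachments leaves each attachment's name in the filename attribute, RouteOnAttribute could use a dynamic property like zip_attachment = ${filename:toLower():endsWith('.zip')} to separate the two branches.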