Member since: 11-16-2015
Posts: 902
Kudos Received: 664
Solutions: 249
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 148 | 09-30-2025 05:23 AM |
| | 618 | 06-26-2025 01:21 PM |
| | 452 | 06-19-2025 02:48 PM |
| | 697 | 05-30-2025 01:53 PM |
| | 9713 | 02-22-2024 12:38 PM |
08-22-2018
03:37 PM
2 Kudos
ConvertRecord is mostly for changing data formats, not structure. UpdateRecord is more appropriate, but I don't believe this is currently possible, as the "address" field doesn't exist in the input, and we don't currently update the schema in that case. I've filed NIFI-5524 to cover this improvement. You'll also be able to accomplish this with JoltTransformRecord when NIFI-5353 is implemented. In the meantime you can use ConvertRecord to convert from flat CSV into flat JSON (the writer can inherit the record schema), then use JoltTransformJSON to push the fields into the "address" object. Here's a spec that will do that:
[
  {
    "operation": "shift",
    "spec": {
      "*": {
        "address1": "[#2].address.address_address1",
        "zipcode": "[#2].address.address_zipcode",
        "*": "[#2].&"
      }
    }
  }
]
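For illustration (the "firstname" field and the values are assumptions; only "address1" and "zipcode" come from the spec above), a flat record like
[ { "firstname": "John", "address1": "123 Main St", "zipcode": "12345" } ]
would come out of that shift as
[ { "address": { "address_address1": "123 Main St", "address_zipcode": "12345" }, "firstname": "John" } ]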
08-22-2018
03:20 PM
1 Kudo
In addition to Shu's suggestion, you can use UpdateAttribute with Store State set to "Store state locally". This gives you a little more flexibility over the sequence values: you can set an initial value with the "Stateful Variables Initial Value" property, increment by any formula you want by specifying it in a user-defined property value using NiFi Expression Language, and so on. If you must get the value from an external database sequence, you could use a ScriptedLookupService with LookupRecord; you'd write a script that queries the database sequence and returns it as the lookup value. I've written a blog post on how to do this with Groovy and PostgreSQL, but it can be done in any supported scripting language against any database (you have to "bring your own SQL" in the script).
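As a minimal sketch of the stateful-counter approach (the attribute name "my_sequence" and the values are just examples), UpdateAttribute could be configured with:
Store State = Store state locally
Stateful Variables Initial Value = 0
my_sequence = ${getStateValue('my_sequence'):plus(1)}
so the my_sequence attribute counts up by one for each flow file, starting from the configured initial value.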
08-21-2018
02:09 PM
Yes, that's correct; otherwise all nodes could grab the same data. If you want to distribute the fetching among the nodes in your cluster, use GenerateTableFetch (still on the primary node only) -> RPG -> Input Port (on the same cluster) -> ExecuteSQL. GTF does not execute the SQL like QueryDatabaseTable does; it just generates the statements to be executed. The RPG -> Input Port step distributes the flow files (containing SQL statements) among the nodes in the cluster, then each node's ExecuteSQL takes the ones it receives and actually does the fetch. Note that you can't merge the results back together on a single node once they've been distributed, but that's usually a good thing, as you can often put the results to a target (such as another DB) in a distributed fashion.
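For a sense of what GTF generates (this is only illustrative; the exact SQL depends on the configured Database Type, partition size, and max-value column, and the table/column names here are made up), each outgoing flow file carries one paged statement along the lines of:
SELECT * FROM mytable ORDER BY id LIMIT 10000 OFFSET 20000
so each downstream ExecuteSQL fetches a different page of the table.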
08-20-2018
03:26 PM
2 Kudos
You can replace everything from SplitAvro -> PutSQL with PutDatabaseRecord; that should give you a pretty good speedup, as it takes the Avro in directly, generates the PreparedStatement once, and then does batch inserts for the entire incoming file. As of NiFi 1.6.0 (via NIFI-4836), if you don't care about the maxvalue.* and fragment.count attributes, you can also set the Output Batch Size property, which sends batches of flow files downstream as soon as they are ready rather than keeping them all in the session until every row has been processed. This allows you to start processing rows downstream while QueryDatabaseTable is still working through the result set, which comes in handy on the initial load of 2M+ records.
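As a rough sketch of the simplified flow (the specific values here are assumptions for illustration), the two processors could be configured like:
QueryDatabaseTable: Max Rows Per Flow File = 10000, Output Batch Size = 1
PutDatabaseRecord: Record Reader = an Avro reader service, Statement Type = INSERT, Table Name = mytable, plus your connection pool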
08-20-2018
01:50 PM
1 Kudo
It sounds like you're trying to put the content of your flow file into an attribute in order to use getDelimitedField(). Instead you should keep it as content and use something like ExtractText to pull only the desired field into an attribute. Can you explain more about your use case and incoming data? Are you always getting the same field out?
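For example, assuming simple comma-delimited content and that you want the third field (the property name "field3" is made up for this illustration), you could add a user-defined property to ExtractText like:
field3 = ^(?:[^,]*,){2}([^,]*)
which captures the third comma-separated value into the field3 (and field3.1) attribute for use downstream.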
08-15-2018
08:10 PM
dcachegroovy.txt Here you go! The command is called "keys" and prints each key on its own line.
08-14-2018
11:37 PM
1 Kudo
The default implementation is to throw UnsupportedOperationException, but many/most/all of the subclasses override it. Check the reference impl, and since you are removing things, also check removeAndGet(), removeByPattern(), and removeByPatternAndGet() to see if they would help; they might save you the trouble of fetching all the keys just to remove some of them.
08-14-2018
06:42 PM
This one is only more complex because you want to convert the field names at the second level rather than the first, so you want to match "address" first, then apply the above spec to each field in there, and then also transfer any fields at the top level over as-is (namely "firstname"). The spec (which is specific to this example) is:
[
  {
    "operation": "shift",
    "spec": {
      "address": {
        "*-*-*": "&(0,1)_&(0,2)_&(0,3)",
        "*-*": "&(0,1)_&(0,2)",
        "*": "&"
      },
      "*": "&"
    }
  }
]
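As an illustration of what that spec produces (the field names and values here are assumed for the example), input like
{ "firstname": "John", "address": { "address-line-1": "123 Main St", "zip-code": "12345" } }
comes out as
{ "address_line_1": "123 Main St", "zip_code": "12345", "firstname": "John" }
with the hyphens in the nested names replaced by underscores; note that with this spec the renamed fields end up at the top level of the output.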
08-09-2018
02:45 PM
I just tried the same line of CSV and the same regex and it works fine. Can you share the entire stack trace from the logs? There might be more information about where it's failing while being scheduled. Also, if you copied/pasted that regex from somewhere, it may have picked up some hidden/unprintable characters; try typing it in by hand instead.
08-09-2018
02:39 AM
Can you share your ExtractText configuration and possibly some sample input? This error occurs when the processor is scheduled, and all it does when scheduled is try to compile the regular expressions, so I presume there is some error in your regex somewhere.