Member since
07-29-2020
574
Posts
323
Kudos Received
176
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2155 | 12-20-2024 05:49 AM |
| | 2454 | 12-19-2024 08:33 PM |
| | 2201 | 12-19-2024 06:48 AM |
| | 1465 | 12-17-2024 12:56 PM |
| | 2111 | 12-16-2024 04:38 AM |
05-19-2023
09:40 AM
Can you please share the configuration for the ExecuteSQL processor and any record writer you used? Also, please share the flowfile output.
05-19-2023
05:23 AM
You can use the ExecuteSQL or ExecuteSQLRecord processor for that. The first gives you the result in Avro format; with the second, you can specify the output format by setting the Record Writer property.
05-18-2023
10:39 AM
What is the format of the records? Are they a JSON array of records? Can you provide sample data or an example of how your records are structured in the flowfile?
05-18-2023
09:24 AM
Hi, That depends on your input and what kind of validation you are trying to do. Can you provide more information on that? For example, the flow below validates JSON input against an Avro schema. The input JSON in the GenerateFlowFile processor looks as follows: {
"records":
[
{
"name":"John",
"age": 25
},
{
"name":"Smith",
"age": 33
}
]
} In the ValidateRecord processor configuration, the Schema Text property value is: {
"type": "record",
"name": "Record",
"fields": [
{
"name": "records",
"type": {
"type": "array",
"items": {
"type": "record",
"namespace": "Record",
"name": "records",
"fields": [
{
"name": "name",
"type": "string"
},
{
"name": "age",
"type": "long"
}
]
}
}
}
]
} According to the schema, the input is valid JSON; however, if you change the age value to null or to some string, it will be invalid. The "Validation Details Attribute Name" property in ValidateRecord will store the validation message in the specified attribute. Hope that helps.
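To make the validation behavior concrete, here is a rough plain-Python sketch (not NiFi itself) of what the Avro schema above enforces: each element of "records" must carry a string "name" and an integer "age". The function name and error-message wording are my own invention.

```python
import json

def validate(payload: str) -> list:
    """Return a list of validation messages; an empty list means valid."""
    errors = []
    data = json.loads(payload)
    for i, rec in enumerate(data.get("records", [])):
        if not isinstance(rec.get("name"), str):
            errors.append(f"records[{i}].name is not a string")
        # bool is a subclass of int in Python, so exclude it explicitly
        if not isinstance(rec.get("age"), int) or isinstance(rec.get("age"), bool):
            errors.append(f"records[{i}].age is not a long")
    return errors

valid = '{"records": [{"name": "John", "age": 25}, {"name": "Smith", "age": 33}]}'
invalid = '{"records": [{"name": "John", "age": null}]}'
print(validate(valid))    # → []
print(validate(invalid))  # → ['records[0].age is not a long']
```

As in ValidateRecord, setting age to null (or a string) produces a validation message instead of a pass.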
05-16-2023
05:50 AM
Hi, I noticed you are using "$(cdn)" instead of "${cdn}" for sql.args1.value, presumably to reference the flowfile attribute cdn that you are using as the merge key. That is probably why the merge has no effect: the statement is looking for the literal string "$(cdn)" instead of the attribute value resolved by "${cdn}".
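A tiny, simplified sketch of the difference (this is an illustration of the substitution behavior, not NiFi's actual Expression Language engine): only the `${name}` form is resolved against the flowfile attributes, so `$(cdn)` passes through as a literal.

```python
import re

def resolve_el(value: str, attributes: dict) -> str:
    """Substitute ${name} references with attribute values; leave everything else alone."""
    return re.sub(r"\$\{(\w+)\}", lambda m: attributes.get(m.group(1), ""), value)

attrs = {"cdn": "abc123"}
print(resolve_el("${cdn}", attrs))  # → abc123
print(resolve_el("$(cdn)", attrs))  # → $(cdn)  (unchanged, so the merge key never matches)
```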
05-13-2023
06:52 AM
Hi @Arash , I'm not sure there is a reader/writer that can handle semi-structured data. You could develop a custom reader/writer, but that would take some effort. Since your input arrives as multiple JSON records, one per line, you can either use the SplitText processor to split each JSON record into its own flowfile and process each record independently, or convert the input into a JSON array using two ReplaceText processors, then use QueryRecord and UpdateRecord with a JsonTreeReader/Writer. First ReplaceText: replace each line break with a comma. Second ReplaceText: surround the entire text with []. Hope that helps. Thanks
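The two ReplaceText steps can be sketched in plain Python as follows (the sample records are invented for illustration; in NiFi the same result comes from the two ReplaceText configurations described above):

```python
import json

# Flowfile content: one JSON record per line (JSON Lines style), sample data assumed
flowfile = '{"name":"John","age":25}\n{"name":"Smith","age":33}'

step1 = flowfile.replace("\n", ",")  # 1st ReplaceText: line break -> comma
step2 = "[" + step1 + "]"            # 2nd ReplaceText: surround the text with []

records = json.loads(step2)          # now a valid JSON array for JsonTreeReader
print(records[0]["name"])            # → John
```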
05-12-2023
09:42 AM
Hi @rafy , I don't think QueryRecord is supposed to work this way, but I could be wrong. QueryRecord basically filters the root array, not a nested array. Since your input is not a JSON array at the root, this is not going to work; and even if the filter "RPATH_STRING(data, '/room')='A'" were to work (not sure why it doesn't), it would return the entire record from the root and not just the subset. I think the question has been asked before, but there was no answer: https://community.cloudera.com/t5/Support-Questions/Select-a-subset-of-data-using-NiFi-QueryRecord/td-p/348002 Now, to resolve your problem, you have two options: Option 1: EvaluateJsonPath -> QueryRecord -> JoltTransformJSON, where the processors are configured as follows: EvaluateJsonPath: to extract the data array to the root. QueryRecord: to query the required record based on the ${ip} attribute. JoltTransformJSON: to convert back to the required schema with a data array, using the spec: [
{
"operation": "shift",
"spec": {
"*":"data[#].&"
}
}
] Option 2: Just one JoltTransformJSON with the following spec: [
{
"operation": "shift",
"spec": {
"data": {
"*": {
"room": {
"${ipAttr}": {
"@2": "data[0]"
}
}
}
}
}
}
] Note: I had to change the ip attribute name to ipAttr, since ip is a reserved Expression Language function.
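As a sanity check, here is a plain-Python sketch of what the Option 2 Jolt spec computes: pick the element of "data" whose "room" matches the ipAttr attribute value and re-wrap it under "data". The sample input and the "A" match value are assumptions for illustration.

```python
import json

def filter_by_room(payload: dict, ip_attr: str) -> dict:
    """Keep only the first "data" element whose room equals the attribute value."""
    matches = [rec for rec in payload.get("data", []) if rec.get("room") == ip_attr]
    return {"data": matches[:1]}

source = {"data": [{"room": "A", "ip": "10.0.0.1"}, {"room": "B", "ip": "10.0.0.2"}]}
print(json.dumps(filter_by_room(source, "A")))
# → {"data": [{"room": "A", "ip": "10.0.0.1"}]}
```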
05-11-2023
03:50 PM
I think you had it right, but you need to convert the division result into a string before applying split on it. Here are the steps: "decImporto": "=divide(@(1,Importo),@(3,Quantita))",
"strImporto": "=toString(@(1,decImporto))",
"array_importo": "=split('[.]',@(1,strImporto))",
"pad_importo": "=rightPad(@(1,array_importo[1]), 8, '0')",
"Importo": "=concat(@(1,array_importo[0]),'.',@(1,pad_importo))"
05-11-2023
11:43 AM
Hi, Please see the modified spec below. The comments indicate what I had to do to make your modify-overwrite-beta spec work based on the input JSON. I hope it works. [{
"operation": "modify-overwrite-beta",
"spec": {
"FatturaElettronicaBody": {
// The DatiGenerali level is not found in the input JSON
// "DatiGenerali": {
"DatiBeniServizi": {
"DettaglioLinee": {
"*": {
//level3
"ScontoMaggiorazione": {
// level2
"*": {
// Quantita is located level 3 and not level 1
"Importo": "=divide(@(1,Importo),@(3,Quantita))"
}
}
}
}
}
//}
}
}
}]
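To illustrate the level correction, here is a plain-Python sketch of what the fixed spec computes: for each line in DettaglioLinee, each ScontoMaggiorazione Importo is divided by that line's Quantita, which sits three levels up from Importo (hence @(3,Quantita)). The sample document and values are assumptions.

```python
def apply_divide(body: dict) -> dict:
    """Divide each nested ScontoMaggiorazione Importo by its line's Quantita."""
    for linea in body["FatturaElettronicaBody"]["DatiBeniServizi"]["DettaglioLinee"]:
        quantita = linea["Quantita"]            # @(3,Quantita): defined on the line
        for sconto in linea["ScontoMaggiorazione"]:
            sconto["Importo"] = sconto["Importo"] / quantita
    return body

doc = {"FatturaElettronicaBody": {"DatiBeniServizi": {"DettaglioLinee": [
    {"Quantita": 2, "ScontoMaggiorazione": [{"Importo": 10}]}
]}}}
result = apply_divide(doc)
print(result["FatturaElettronicaBody"]["DatiBeniServizi"]
      ["DettaglioLinee"][0]["ScontoMaggiorazione"][0]["Importo"])  # → 5.0
```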
05-11-2023
09:28 AM
Hi, Will this help: https://stackoverflow.com/questions/45052625/how-can-i-change-the-timezone-of-a-column-in-apache-nifi