About joseomjr

cjervis · ‎06-30-2023

@madhs Have you resolved your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.

steven-matison · ‎06-26-2023

@Ghilani You should know these three articles intimately if you want to use ExecuteScript:

https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-1/ta-p/248922
https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-2/ta-p/249018
https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-3/ta-p/249148

That said, I sometimes find it helpful to see a working example and modify it from there. Here is a GitHub repository with a sample flow definition file (01_Fraud_Detection_Demo_Params_ExecuteScript.json) and a script file (Fraud Demo ExecuteScript.py) that should work out of the box: https://github.com/cldr-steven-matison/Fraud-Prevention-With-Cloudera-SSB/tree/main/Templates

Pay attention to the imports; line 160 is where the script gets the flowfile. My flow ignores the content, but you should be able to find references in the Part 1 cookbook for anything you want to do with flowfile content.

Wpq · ‎06-26-2023

The issue ended up being that ip is a reserved word (a default function) so changing the name of the attribute from ip to extractedIp solved it.

cotopaul · ‎06-26-2023

@Carson, as @joseomjr wrote (the full link was left out by mistake), you should take a look at the following article, as it describes exactly what you need: https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-3/ta-p/249148 Basically, you add a property to your NiFi processor, set its value to reference your parameter, and then call it in your script: myValue1 = myProperty1.getValue()

joseomjr · ‎06-25-2023

Apache PDFBox is supposed to allow you to merge PDF content. Since this is a Java library you can create a scripted Groovy processor to merge the files for you. https://pdfbox.apache.org https://javadoc.io/doc/org.apache.pdfbox/pdfbox/2.0.27/index.html

joseomjr · ‎06-24-2023

I've done this with several Java libraries and a Groovy-based InvokeScriptedProcessor. I just looked up Kazoo: you can do the same with Python/Jython as long as the Kazoo module is 100% Python and not C-based.

joseomjr · ‎06-24-2023

First, the incoming data should be proper JSON that can be easily parsed and processed, i.e. something like this:

    [
      [
        { "ID": "100000", "Date": "2022-09-22", "Start Time": "08:00", "End Time": "14:00" },
        { "ID": "100001", "Date": "2022-09-02", "Start Time": "08:00", "End Time": "14:00" },
        { "ID": "100002", "Date": "2022-09-02", "Start Time": "08:00", "End Time": "14:00" }
      ],
      {
        "res_data": [
          { "Record": "Record 2", "Description": "Invalid values for ID or Date" },
          { "Record": "Record 3", "Description": "Invalid values for ID or Date" }
        ]
      }
    ]

Then Groovy code like this:

    import groovy.json.JsonOutput
    import groovy.json.JsonSlurper

    JsonSlurper jsonSlurper = new JsonSlurper()
    JsonOutput jsonOutput = new JsonOutput()
    List data = jsonSlurper.parseText('''
    [
      [
        { "ID": "100000", "Date": "2022-09-22", "Start Time": "08:00", "End Time": "14:00" },
        { "ID": "100001", "Date": "2022-09-02", "Start Time": "08:00", "End Time": "14:00" },
        { "ID": "100002", "Date": "2022-09-02", "Start Time": "08:00", "End Time": "14:00" }
      ],
      {
        "res_data": [
          { "Record": "Record 2", "Description": "Invalid values for ID or Date" },
          { "Record": "Record 3", "Description": "Invalid values for ID or Date" }
        ]
      }
    ]
    ''')
    println(data)
    println("=" * 80)
    List<Map<String, String>> transformedList = []
    data[0].eachWithIndex { map, index ->
        Map<String, Object> record = data[1]["res_data"].find { it["Record"] == "Record ${index + 1}" }
        if (record) {
            map.putAll(record)
        }
        transformedList.add(map)
    }
    println(jsonOutput.prettyPrint(jsonOutput.toJson(transformedList)))

Creates an output like this:

    [[[ID:100000, Date:2022-09-22, Start Time:08:00, End Time:14:00], [ID:100001, Date:2022-09-02, Start Time:08:00, End Time:14:00], [ID:100002, Date:2022-09-02, Start Time:08:00, End Time:14:00]], [res_data:[[Record:Record 2, Description:Invalid values for ID or Date], [Record:Record 3, Description:Invalid values for ID or Date]]]]
    ================================================================================
    [
        {
            "ID": "100000",
            "Date": "2022-09-22",
            "Start Time": "08:00",
            "End Time": "14:00"
        },
        {
            "ID": "100001",
            "Date": "2022-09-02",
            "Start Time": "08:00",
            "End Time": "14:00",
            "Record": "Record 2",
            "Description": "Invalid values for ID or Date"
        },
        {
            "ID": "100002",
            "Date": "2022-09-02",
            "Start Time": "08:00",
            "End Time": "14:00",
            "Record": "Record 3",
            "Description": "Invalid values for ID or Date"
        }
    ]

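For anyone working in Python rather than Groovy, the merge in the reply above can be sketched roughly as follows. This is a standalone sketch, not NiFi script code: the function name merge_records is made up for illustration, and unlike the Groovy version it copies each row instead of modifying it in place.

```python
import json

# Same sample payload as in the reply above: a list of rows, followed by an
# object whose "res_data" entries refer to rows by position ("Record N").
raw = '''
[
  [
    {"ID": "100000", "Date": "2022-09-22", "Start Time": "08:00", "End Time": "14:00"},
    {"ID": "100001", "Date": "2022-09-02", "Start Time": "08:00", "End Time": "14:00"},
    {"ID": "100002", "Date": "2022-09-02", "Start Time": "08:00", "End Time": "14:00"}
  ],
  {
    "res_data": [
      {"Record": "Record 2", "Description": "Invalid values for ID or Date"},
      {"Record": "Record 3", "Description": "Invalid values for ID or Date"}
    ]
  }
]
'''


def merge_records(data):
    """Attach each res_data entry to the row whose 1-based position it names."""
    rows = data[0]
    errors = data[1]["res_data"]

    # Index the error entries by their "Record N" label; the first entry for
    # a label wins, which mirrors the Groovy find call.
    by_label = {}
    for entry in errors:
        by_label.setdefault(entry["Record"], entry)

    merged = []
    for index, row in enumerate(rows):
        combined = dict(row)  # copy, so the parsed input is left untouched
        combined.update(by_label.get("Record %d" % (index + 1), {}))
        merged.append(combined)
    return merged


transformed = merge_records(json.loads(raw))
print(json.dumps(transformed, indent=2))
```

Rows without a matching "Record N" entry pass through unchanged, so the first row keeps only its four original fields.
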
joseomjr · ‎06-24-2023

This is Groovy code that achieves what you've mentioned, and you should be able to adapt it to your scripted processor.

    import groovy.json.JsonOutput
    import groovy.json.JsonSlurper

    JsonSlurper jsonSlurper = new JsonSlurper()
    JsonOutput jsonOutput = new JsonOutput()
    Map<String, String> data = jsonSlurper.parseText('''{ "Message": "\nRecord 1:\nRequired data is missing. \n\nRecord 2:\nprocessing failed\n" }''')
    println(data)
    println("=" * 80)
    List<String> records = data.Message.split("\\nRecord\\s\\d+:\\n").findAll { it.trim() != "" }
    println(records)
    println("=" * 80)
    Map<String, String> messages = [:]
    records.eachWithIndex { message, index ->
        messages["Record ${index + 1}"] = message.trim()
    }
    println(jsonOutput.toJson(messages))

This is the output:

    [Message:
    Record 1:
    Required data is missing. 

    Record 2:
    processing failed
    ]
    ================================================================================
    [Required data is missing. 
    , processing failed
    ]
    ================================================================================
    {"Record 1":"Required data is missing.","Record 2":"processing failed"}
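
If the scripted processor is written in Python instead, the same split can be sketched with the standard re and json modules. This is a standalone sketch under that assumption; split_messages is a name chosen here for illustration. As in the Groovy version, the keys are numbered by position after the split rather than taken from the digits in the message.

```python
import json
import re

# Same sample message as in the reply above. The raw string keeps the JSON
# \n escapes intact so that json.loads turns them into real newlines.
payload = json.loads(
    r'{"Message": "\nRecord 1:\nRequired data is missing. \n\nRecord 2:\nprocessing failed\n"}'
)


def split_messages(message):
    """Split on the 'Record N:' header lines and number the pieces in order."""
    parts = re.split(r"\nRecord\s\d+:\n", message)
    # The text before the first header is empty, so drop blank pieces.
    parts = [part for part in parts if part.strip() != ""]
    return {"Record %d" % (i + 1): part.strip() for i, part in enumerate(parts)}


messages = split_messages(payload["Message"])
print(json.dumps(messages))
```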

joseomjr · ‎06-24-2023

Should look more like this: context.getProperty(externalRequestTokenDescriptor).evaluateAttributeExpressions().getValue()

joseomjr · ‎06-22-2023

This is likely what's generating the error: session.remove(flowFile) is called inside the loop, but the flowfile should only be removed once.

    for element in found_elements:
        session.remove(flowFile)
        print(element.tag, element.text)
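
A corrected shape for that loop might look like the sketch below, with the removal moved out of the loop. It is written to run outside NiFi, so several things are assumptions: FakeSession is a hypothetical stand-in for the real session object (the error it raises only imitates a double removal, it is not NiFi's actual exception), the flowFile value is a placeholder, and found_elements is assumed to come from xml.etree.ElementTree because the original code reads element.tag and element.text.

```python
import xml.etree.ElementTree as ET


class FakeSession(object):
    """Hypothetical stand-in for NiFi's session so the sketch runs standalone."""

    def __init__(self):
        self.removed = []

    def remove(self, flow_file):
        # Imitates the failure mode: removing the same flowfile twice is an error.
        if flow_file in self.removed:
            raise RuntimeError("flowfile was already removed")
        self.removed.append(flow_file)


session = FakeSession()
flowFile = "flowfile-1"  # placeholder for the real FlowFile object

# Assumed sample input; the original post does not show where the elements come from.
root = ET.fromstring("<items><item>a</item><item>b</item></items>")
found_elements = root.findall("item")

seen = []
for element in found_elements:
    # Only per-element work belongs inside the loop.
    seen.append((element.tag, element.text))
    print(element.tag, element.text)

# Remove the flowfile exactly once, after the loop has finished.
session.remove(flowFile)
```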

Last Visited	‎12-17-2024 09:55 PM

Member Since	‎06-14-2023 12:02 PM
Posts	95
Kudos received	33

Cloudera Community

Re: Nifi 2.0.0 M1 Installation error with python

Re: how to replace empty string with null in neste...

Re: ListenUDP Fault tolerance

Re: terminating kafka connection if publish kafka ...

Re: unable to resolve class groovy.yaml.YamlSlurpe...

Re: NiFi InvokeHTTP POST request authenticate via ...

Re: How can we read a flowfile content with a pyth...

Re: Why is there an error on the colon in ${attrib...

Re: Is it possible to reference parameters within ...

Re: How to merge two pdf files from two flow file...

Re: How to use external library in Apache nifi

Re: How to loop through array of json object and r...

Re: Getting Empty Array Using Groovy Script in Ni...

Re: InvokeScriptedProcessor - Can't get sensitive ...

Re: Apache NiFi already in use for an active callb...