Member since
06-14-2023
95
Posts
33
Kudos Received
8
Solutions
06-30-2023
05:19 AM
@madhs Have you resolved your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.
06-26-2023
09:18 AM
@Ghilani You should know these three articles intimately if you want to use ExecuteScript:
https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-1/ta-p/248922
https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-2/ta-p/249018
https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-3/ta-p/249148
That said, I sometimes find it helpful to start from a working example and modify from there. Here is a GitHub repo with a sample flow definition file (01_Fraud_Detection_Demo_Params_ExecuteScript.json) and script file (Fraud Demo ExecuteScript.py) that should work out of the box:
https://github.com/cldr-steven-matison/Fraud-Prevention-With-Cloudera-SSB/tree/main/Templates
Pay attention to the imports; line 160 is what you want for getting the flowfile. My flow ignores the content, but you should be able to find references in the Part 1 cookbook for anything you want to do with flowfile content.
06-26-2023
03:37 AM
The issue ended up being that ip is a reserved word (a default function), so renaming the attribute from ip to extractedIp solved it.
06-26-2023
12:02 AM
@Carson, as @joseomjr wrote (though the full link was left out by mistake), take a look at the following article, as it describes exactly what you need: https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-3/ta-p/249148
Basically, you add the property to your NiFi processor, point it at your parameter value, and then read it in your script:
myValue1 = myProperty1.getValue()
06-25-2023
12:21 PM
Apache PDFBox is supposed to allow you to merge PDF content. Since this is a Java library, you can create a scripted Groovy processor to merge the files for you.
https://pdfbox.apache.org
https://javadoc.io/doc/org.apache.pdfbox/pdfbox/2.0.27/index.html
06-24-2023
08:16 PM
I've done this with several Java libraries and a Groovy-based InvokeScriptedProcessor. I just looked up Kazoo: you can do the same with Python/Jython as long as the Kazoo module is 100% Python and not C-based.
06-24-2023
05:55 PM
First, the incoming data should be proper JSON that can be easily parsed and processed, i.e. something like this:
[
[
{
"ID": "100000",
"Date": "2022-09-22",
"Start Time": "08:00",
"End Time": "14:00"
},
{
"ID": "100001",
"Date": "2022-09-02",
"Start Time": "08:00",
"End Time": "14:00"
},
{
"ID": "100002",
"Date": "2022-09-02",
"Start Time": "08:00",
"End Time": "14:00"
}
],
{
"res_data": [
{
"Record": "Record 2",
"Description": "Invalid values for ID or Date"
},
{
"Record": "Record 3",
"Description": "Invalid values for ID or Date"
}
]
}
]
Then Groovy code like this:
import groovy.json.JsonOutput
import groovy.json.JsonSlurper
JsonSlurper jsonSlurper = new JsonSlurper()
JsonOutput jsonOutput = new JsonOutput()
List data = jsonSlurper.parseText('''
[
[
{
"ID": "100000",
"Date": "2022-09-22",
"Start Time": "08:00",
"End Time": "14:00"
},
{
"ID": "100001",
"Date": "2022-09-02",
"Start Time": "08:00",
"End Time": "14:00"
},
{
"ID": "100002",
"Date": "2022-09-02",
"Start Time": "08:00",
"End Time": "14:00"
}
],
{
"res_data": [
{
"Record": "Record 2",
"Description": "Invalid values for ID or Date"
},
{
"Record": "Record 3",
"Description": "Invalid values for ID or Date"
}
]
}
]
''')
println(data)
println("=" * 80)
List<Map<String, String>> transformedList = []
data[0].eachWithIndex { map, index ->
    Map<String, Object> record = data[1]["res_data"].find { it["Record"] == "Record ${index + 1}" }
    if (record) {
        map.putAll(record)
    }
    transformedList.add(map)
}
println(jsonOutput.prettyPrint(jsonOutput.toJson(transformedList)))
Creates an output like this:
[[[ID:100000, Date:2022-09-22, Start Time:08:00, End Time:14:00], [ID:100001, Date:2022-09-02, Start Time:08:00, End Time:14:00], [ID:100002, Date:2022-09-02, Start Time:08:00, End Time:14:00]], [res_data:[[Record:Record 2, Description:Invalid values for ID or Date], [Record:Record 3, Description:Invalid values for ID or Date]]]]
================================================================================
[
{
"ID": "100000",
"Date": "2022-09-22",
"Start Time": "08:00",
"End Time": "14:00"
},
{
"ID": "100001",
"Date": "2022-09-02",
"Start Time": "08:00",
"End Time": "14:00",
"Record": "Record 2",
"Description": "Invalid values for ID or Date"
},
{
"ID": "100002",
"Date": "2022-09-02",
"Start Time": "08:00",
"End Time": "14:00",
"Record": "Record 3",
"Description": "Invalid values for ID or Date"
}
]
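If your scripted processor uses the Python engine rather than Groovy, roughly the same merge can be sketched with the standard `json` module. This is an assumed translation, not part of the original answer: `merge_errors` is a hypothetical helper name, and unlike the Groovy version it copies each record instead of modifying it in place.

```python
import json


def merge_errors(data):
    """Attach each res_data entry to the record it names ("Record N" is 1-based)."""
    records, errors = data[0], data[1]["res_data"]
    # Index the error entries by their "Record" label for direct lookup.
    errors_by_label = {entry["Record"]: entry for entry in errors}
    merged = []
    for index, record in enumerate(records):
        combined = dict(record)  # copy so the parsed input is left untouched
        combined.update(errors_by_label.get("Record %d" % (index + 1), {}))
        merged.append(combined)
    return merged


data = json.loads("""
[
  [
    {"ID": "100000", "Date": "2022-09-22", "Start Time": "08:00", "End Time": "14:00"},
    {"ID": "100001", "Date": "2022-09-02", "Start Time": "08:00", "End Time": "14:00"},
    {"ID": "100002", "Date": "2022-09-02", "Start Time": "08:00", "End Time": "14:00"}
  ],
  {"res_data": [
    {"Record": "Record 2", "Description": "Invalid values for ID or Date"},
    {"Record": "Record 3", "Description": "Invalid values for ID or Date"}
  ]}
]
""")
merged = merge_errors(data)
print(json.dumps(merged, indent=2))
```

One design difference to be aware of: if two `res_data` entries carried the same "Record" label, this lookup table keeps the last one, whereas Groovy's `find` returns the first.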
06-24-2023
05:18 PM
This Groovy code achieves what you've described, and you should be able to adapt it to your scripted processor:
import groovy.json.JsonOutput
import groovy.json.JsonSlurper
JsonSlurper jsonSlurper = new JsonSlurper()
JsonOutput jsonOutput = new JsonOutput()
Map<String, String> data = jsonSlurper.parseText('''{
"Message": "\nRecord 1:\nRequired data is missing. \n\nRecord 2:\nprocessing failed\n"
}''')
println(data)
println("=" * 80)
List<String> records = data.Message.split("\\nRecord\\s\\d+:\\n").findAll { it.trim() != "" }
println(records)
println("=" * 80)
Map<String, String> messages = [:]
records.eachWithIndex { message, index ->
    // toString() so the map key is a plain String rather than a GString
    messages["Record ${index + 1}".toString()] = message.trim()
}
println(jsonOutput.toJson(messages))
This is the output:
[Message:
Record 1:
Required data is missing.
Record 2:
processing failed
]
================================================================================
[Required data is missing.
, processing failed
]
================================================================================
{"Record 1":"Required data is missing.","Record 2":"processing failed"}
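For anyone using the Python engine instead, the same split can be sketched with the standard `re` and `json` modules. This is an assumed translation of the Groovy above, not something from the original thread; `split_records` is a hypothetical helper name.

```python
import json
import re


def split_records(message):
    """Split a 'Record N:' style message into a {"Record N": text} dict."""
    # Same pattern as the Groovy version: newline, "Record <digits>:", newline.
    parts = re.split(r"\nRecord\s\d+:\n", message)
    # Drop empty pieces (e.g. the one before the first "Record 1:" header).
    records = [part.strip() for part in parts if part.strip()]
    # Number the records by position, as the Groovy code does.
    return {"Record %d" % (index + 1): text for index, text in enumerate(records)}


data = json.loads(
    '{"Message": "\\nRecord 1:\\nRequired data is missing. \\n\\nRecord 2:\\nprocessing failed\\n"}'
)
messages = split_records(data["Message"])
print(json.dumps(messages))
```

As in the Groovy code, the keys are numbered by position after empty pieces are dropped, so they are not read from the digits in the message itself.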
06-24-2023
04:56 PM
Should look more like this:
context.getProperty(externalRequestTokenDescriptor).evaluateAttributeExpressions().getValue()
06-22-2023
05:56 PM
This is likely what's generating the error: you're removing the flow file inside a loop, and that should only be done once.
for element in found_elements:
    session.remove(flowFile)
    print(element.tag, element.text)
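A minimal sketch of the fix, written so it runs outside NiFi. `FakeSession` is a hypothetical stand-in for the `session` object that ExecuteScript provides, and the XML, the `.//item` lookup, and the `process` helper are made-up sample values; the failure on a second removal is an assumption about how the real session behaves. The only point is that `session.remove(flowFile)` sits after the loop, so it runs exactly once.

```python
import xml.etree.ElementTree as ET


class FakeSession:
    """Hypothetical stand-in for NiFi's session object, for illustration only."""

    def __init__(self):
        self.removed = []

    def remove(self, flow_file):
        # Assumed failure mode: a flow file can only be removed once.
        if flow_file in self.removed:
            raise RuntimeError("flow file already removed: %s" % flow_file)
        self.removed.append(flow_file)


def process(session, flow_file, xml_text):
    """Collect (tag, text) pairs, then remove the flow file a single time."""
    found_elements = ET.fromstring(xml_text).findall(".//item")
    results = [(element.tag, element.text) for element in found_elements]
    # Outside the loop: runs once no matter how many elements matched.
    session.remove(flow_file)
    return results


session = FakeSession()
results = process(session, "flowFile-1", "<root><item>a</item><item>b</item></root>")
print(results)
```

In a real script the `session` and `flowFile` variables come from NiFi, so only the placement of the `remove` call carries over.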