Member since
06-14-2023
95
Posts
33
Kudos Received
8
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
3843 | 12-29-2023 09:36 AM | |
5651 | 12-28-2023 01:01 PM | |
1110 | 12-27-2023 12:14 PM | |
558 | 12-08-2023 12:47 PM | |
1749 | 11-21-2023 10:56 PM |
10-01-2023
01:29 PM
If you can run this python code under Python without any external modules, you should be able to run it as a scripted processor and have everything happen inside of NiFi.
... View more
09-21-2023
03:46 PM
Post an example of the actual CSV and I can see if I can help with this.
... View more
09-18-2023
01:11 PM
Forgot the ";" in the replacement value $1($2='$3');
... View more
09-16-2023
04:35 PM
I would do this with a Groovy based InvokeScriptedProcessor Using this code: import groovy.json.JsonOutput
import groovy.json.JsonSlurper
import java.nio.charset.StandardCharsets
import org.apache.commons.io.IOUtils
class GroovyProcessor implements Processor {
PropertyDescriptor BATCH_SIZE = new PropertyDescriptor.Builder()
.name("BATCH_SIZE")
.displayName("Batch Size")
.description("The number of incoming FlowFiles to process in a single execution of this processor.")
.required(true)
.defaultValue("1000")
.addValidator(StandardValidators.POSITIVE_INTEGER_VALIDATOR)
.build()
Relationship REL_SUCCESS = new Relationship.Builder()
.name("success")
.description('FlowFiles that were successfully processed are routed here')
.build()
Relationship REL_FAILURE = new Relationship.Builder()
.name("failure")
.description('FlowFiles that were not successfully processed are routed here')
.build()
ComponentLog log
void initialize(ProcessorInitializationContext context) { log = context.logger }
Set<Relationship> getRelationships() { return [REL_FAILURE, REL_SUCCESS] as Set }
Collection<ValidationResult> validate(ValidationContext context) { null }
PropertyDescriptor getPropertyDescriptor(String name) { null }
void onPropertyModified(PropertyDescriptor descriptor, String oldValue, String newValue) { }
List<PropertyDescriptor> getPropertyDescriptors() { Collections.unmodifiableList([BATCH_SIZE]) as List<PropertyDescriptor> }
String getIdentifier() { null }
JsonSlurper jsonSlurper = new JsonSlurper()
JsonOutput jsonOutput = new JsonOutput()
void onTrigger(ProcessContext context, ProcessSessionFactory sessionFactory) throws ProcessException {
ProcessSession session = sessionFactory.createSession()
try {
List<FlowFile> flowFiles = session.get(context.getProperty(BATCH_SIZE).asInteger())
if (!flowFiles) return
flowFiles.each { flowFile ->
List data = null
session.read(flowFile, {
inputStream -> data = jsonSlurper.parseText(IOUtils.toString(inputStream, StandardCharsets.UTF_8))
} as InputStreamCallback)
List outputData = []
data.each { order ->
outputData.add("${order.orderId} ${order.orderName}")
order.orderItems.each { orderItem ->
outputData.add("${orderItem.orderItemId} ${orderItem.orderItemName}")
}
}
FlowFile newFlowFile = session.create()
newFlowFile = session.write(newFlowFile, { outputStream -> outputStream.write(outputData.join('\n').getBytes(StandardCharsets.UTF_8)) } as OutputStreamCallback)
session.transfer(newFlowFile, REL_SUCCESS)
session.remove(flowFile)
}
session.commit()
} catch (final Throwable t) {
log.error('{} failed to process due to {}; rolling back session', [this, t] as Object[])
session.rollback(true)
throw t
}
}
}
processor = new GroovyProcessor() Don't let all that code scare you when the part that's doing the formatting is only these lines: This is the generated output:
... View more
09-16-2023
01:38 PM
Are you or have you considered leveraging ES bulk API? Bulk API | Elasticsearch Guide [8.9] | Elastic
... View more
09-16-2023
01:30 PM
1 Kudo
ReplaceText processor should work.. This RegEx search pattern should do the trick: ^(.*?)\(([^=]*)=([^\)]*)\);$ With this replacement value: $1($2='$3') Don't forget to set the evaluation mode to line-by-line...
... View more
09-16-2023
01:15 PM
As far as I know, NiFi leverage Jython and would currently limit you to using Python 2.7 compatible code and only modules written in pure Python. Home | Jython
... View more
09-16-2023
01:10 PM
This doesn't look like a valid regex pattern \E^modbus_log\.\d{4}-\d{2}-\d{2}_\d{2}-\d{2}$ Maybe it should be ^modbus_log\.\d{4}-\d{2}-\d{2}_\d{2}-\d{2}$
... View more
08-28-2023
07:10 AM
The 2 lines I provided are the very first lines and then all your code comes after.
... View more
07-29-2023
07:56 PM
NiFi is awesome with so many out-of-the-box processors...however, I have found sometimes a very specialized scripted Groovy processor that fetches 1000 or more files at a time to perform significantly faster... especially if your custom processor consolidates several processors into 1.
... View more