About joseomjr

joseomjr · ‎02-01-2024

Have you seen this post? https://community.cloudera.com/t5/Support-Questions/Nifi-2-0-0-M1-Installation-error-with-python/m-p/381430

joseomjr · ‎01-05-2024

If clustered, is Zookeeper running on each node or has that been separated? Wondering if selecting a new master or having an acceptable quorum is contributing to the slowness.

joseomjr · ‎01-02-2024

I don't see any parquet NAR files in my NiFi 2.0.0-M1 install or in the Docker image.

joseomjr · ‎01-02-2024

If you were able to delete it, then it almost sounds like that attribute might have been something from the 1.X versions and not something new in the 2.X version.

joseomjr · ‎12-29-2023

@Heeya8876, both @SAMSAL and I have recently gone through the adventures of getting 2.0.0-M1 to run with the Python extension enabled. Here are some findings so far on the Linux side of things.

- Java 21 is required (any platform).
- Python 3.9+ is required (any platform). I believe @SAMSAL (correct me if I'm wrong) said Python 3.12 did NOT work, but we both got 3.11 to run.
- If Java 21 is installed, make sure it's set as the default with "sudo update-alternatives --config java".
- Make sure your environment has JAVA_HOME defined with the path for Java 21.
- Make sure Python 3.9+ is the default prior to running NiFi with "sudo update-alternatives --config python3".
- Executing "python3 --version" should show whichever version you set as your default, and it should be 3.9~3.11.
- You can see which version NiFi copied by running "./work/python/controller/bin/python3 --version". If this shows anything <3.9, delete the work folder, follow the steps above, and try again.
- If you build a processor from scratch, the Developer guide says to use this for your __init__:
      def __init__(self, **kwargs):
          super().__init__(**kwargs)
  You'll get an error. Replace super().__init__(**kwargs) with pass, like the examples that come with the install.
- Changes to your Python extensions are not immediate. NiFi polls the directory periodically to detect changes, download dependencies, and load the updated processors. Sometimes I had to restart NiFi to get it to detect my changes if my previous code update made it really unhappy.
- ./logs/nifi-python.log will be your friend for Python extension related issues.
- If your Python extension has dependencies and NiFi fails to download them, you can see the command it attempted in nifi-python.log. I manually ran the commands from the log, and they downloaded the modules into the correct place and everything worked. Perhaps there's a timeout for module downloads? (Just a guess, since the module had a ton of large dependencies.)
- I don't think I saw it in the Developer's Guide, but I noticed while building a custom FlowFileTransform Python extension that the "content" data returned with the FlowFileTransformResult should be a string or byte array.

@SAMSAL has additional insight on getting it to start up on Windows.
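
The version requirements above can be turned into a quick pre-flight check. This is only a minimal sketch and not part of NiFi: the function name is_supported_python is hypothetical, and the 3.9 to 3.11 range simply encodes what this post reports (3.12 is excluded only because it reportedly did not work with 2.0.0-M1).

```python
import sys

# Range reported to work with NiFi 2.0.0-M1 in the post above.
# Assumption: 3.12 is excluded only because it reportedly failed.
MIN_VERSION = (3, 9)
MAX_VERSION = (3, 11)


def is_supported_python(version_info=None):
    """Return True if (major, minor) falls within the reported working range."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_VERSION <= major_minor <= MAX_VERSION


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if is_supported_python() else "outside the reported working range"
    print(f"python3 is {found}: {status}")
```

Running the script with both the system python3 and ./work/python/controller/bin/python3 should show whether the copy NiFi made matches the default you configured.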

joseomjr · ‎12-28-2023

I agree with @SAMSAL's approach. If you can provide a parameter or something in the header or request so your API returns a JSON response each time, it will be a lot easier for you to parse the response and build the request for the next step in your flow.

joseomjr · ‎12-28-2023

ExecuteGroovyScript alternative with this input:

{ "idTransakcji": "123", "date": "", "name": "sam" }

import groovy.json.JsonOutput
import groovy.json.JsonSlurper
import java.nio.charset.StandardCharsets

JsonSlurper jsonSlurper = new JsonSlurper()
JsonOutput jsonOutput = new JsonOutput()

FlowFile flowFile = session.get()
if(!flowFile) return

flowFile = session.write(flowFile, { inputStream, outputStream ->
    Map data = jsonSlurper.parse(inputStream)
    data = [
        "id": data.idTransakcji,
        "user": [
            "date": data.date?.isNumber() ? Long.parseLong(data.date) : null,
            "name": data.name
        ]
    ]
    outputStream.write(jsonOutput.toJson(data).getBytes(StandardCharsets.UTF_8))
} as StreamCallback)

session.transfer(flowFile, REL_SUCCESS)
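
For anyone who would rather try the same reshaping in a NiFi 2.x Python extension, the mapping itself can be sketched in plain Python. This is an illustrative sketch, not a drop-in processor: the function name reshape is hypothetical, and the numeric check uses int() rather than Groovy's isNumber(), so the two are not identical on edge cases such as decimal strings.

```python
import json


def reshape(record):
    """Map the flat input record to the nested id/user structure."""
    try:
        # Assumption: any value int() accepts counts as numeric; the Groovy
        # isNumber() check is not identical (it also accepts decimals).
        date = int(record.get("date"))
    except (TypeError, ValueError):
        # Empty strings, missing keys, and non-numeric text all become null.
        date = None
    return {
        "id": record.get("idTransakcji"),
        "user": {"date": date, "name": record.get("name")},
    }


sample = '{ "idTransakcji": "123", "date": "", "name": "sam" }'
print(json.dumps(reshape(json.loads(sample))))
# prints {"id": "123", "user": {"date": null, "name": "sam"}}
```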

joseomjr · ‎12-28-2023

...a 3rd option because I like scripted processors 😂...using ExecuteGroovyScript:

import groovy.json.JsonOutput
import groovy.json.JsonSlurper
import java.nio.charset.StandardCharsets

JsonSlurper jsonSlurper = new JsonSlurper()
JsonOutput jsonOutput = new JsonOutput()

FlowFile flowFile = session.get()
if(!flowFile) return

flowFile = session.write(flowFile, { inputStream, outputStream ->
    List<Map> data = jsonSlurper.parse(inputStream)
    data.each { it.order_item = jsonSlurper.parseText(it.order_item) }
    outputStream.write(jsonOutput.toJson(data).getBytes(StandardCharsets.UTF_8))
} as StreamCallback)

session.transfer(flowFile, REL_SUCCESS)

It looks like a lot, but this is the line that takes the string JSON and converts it to JSON:

it.order_item = jsonSlurper.parseText(it.order_item)
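
The same idea, parsing a JSON string that is embedded inside a JSON record, can be sketched in plain Python. The function name expand_order_items and the sample record (order_id, sku, qty) are hypothetical and only there to show the before and after shape; only the order_item field name comes from the script above.

```python
import json


def expand_order_items(records):
    """Replace each record's JSON-encoded order_item string with parsed JSON."""
    for record in records:
        record["order_item"] = json.loads(record["order_item"])
    return records


# Hypothetical sample: order_item arrives as a JSON string, not an object.
sample = '[{"order_id": 1, "order_item": "{\\"sku\\": \\"A1\\", \\"qty\\": 2}"}]'
print(json.dumps(expand_order_items(json.loads(sample))))
# prints [{"order_id": 1, "order_item": {"sku": "A1", "qty": 2}}]
```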

joseomjr · ‎12-27-2023

PublishKafka must have an active connection to Kafka before it even attempts to send a FlowFile, which means it never gets into the block of code that sends the FlowFile and routes it to "success" or "failure" accordingly. Making sure your Kafka cluster is up and running should be the focus if this is what you're experiencing. My guess is that if this "error" topic you have is on the same Kafka cluster, then even if PublishKafka were able to route a FlowFile to "failure" when it's unable to connect to Kafka, publishing to that topic wouldn't work anyway.
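
As a first-pass check that a broker is even reachable from the NiFi host, something like the following could be used. It is only a sketch under stated assumptions: the function name broker_reachable and the localhost:9092 address are hypothetical, and an open TCP port does not prove the Kafka cluster is healthy, only that something is listening there.

```python
import socket


def broker_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


if __name__ == "__main__":
    # Hypothetical broker address; replace with your own bootstrap server.
    print(broker_reachable("localhost", 9092))
```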

joseomjr · ‎12-27-2023

How the FlowFile is distributed from your ListenUDP processor to the next processor in the flow is defined in the connection between them. Putting something like HAProxy, Nginx, or any other form of load balancer in front of your NiFi cluster would be a way to ensure your data is forwarded to any of the nodes that are still accessible, as long as the cluster is up.

Last Visited	‎12-25-2025 10:07 PM

Member Since	‎06-14-2023 12:02 PM
Posts	96
Kudos received	34
