Member since: 01-27-2023
Posts: 229
Kudos Received: 73
Solutions: 45
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 130 | 02-23-2024 01:14 AM
 | 210 | 01-26-2024 01:31 AM
 | 196 | 11-22-2023 12:28 AM
 | 423 | 11-22-2023 12:10 AM
 | 581 | 11-06-2023 12:44 AM
06-15-2023
06:37 AM
@bhadraka, Out of the box, I would try the following two scenarios, to see which one fits your use case better: 1) Use TailFile to monitor that CSV file. Assuming that all the newly added rows are appended to the end of the file, once the processor gets executed, TailFile will output either one line per FlowFile or multiple lines per FlowFile, depending on your schedule. 2) Use ListFile-FetchFile or GetFile to read the CSV file without deleting it, then split the records using SplitRecord and only process the last row, using a RouteOnAttribute based on the fragment index.
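Outside NiFi, the core of the second approach (read the whole file, keep only the most recently appended record) can be sketched in plain Python — the column layout and sample data below are just assumptions for illustration:

```python
import csv
import io

def last_row(csv_text):
    """Return the final data row of a CSV document, skipping the header.

    This mirrors what SplitRecord + RouteOnAttribute on the fragment
    index achieves: everything except the last record is discarded.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    # rows[0] is the header; rows[-1] is the most recently appended record
    return rows[-1] if len(rows) > 1 else None

sample = "id,temperature\n1,20\n2,21\n3,22\n"
print(last_row(sample))  # only the newest record is kept
```

In NiFi itself you would of course let TailFile or the List/Fetch pair do the reading; this only shows the "keep the last row" logic.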
06-08-2023
12:24 AM
yea, sorry, but this is a topic where I am no expert 😞 I do not really understand how certificates work and how they should be generated and used. Nevertheless, if you are certain that your certificate is correct and that you should be able to connect to your Elasticsearch, you should define it in your SSL Context Service and then proceed with configuring your NiFi processor in order to extract the data you need. Here is how to configure the SSL Context Service for Elasticsearch: https://community.cloudera.com/t5/Support-Questions/Configure-StandardSSLContextService-for-Elasticsearch/m-p/302719 And here is an example of how to configure the NiFi processor: https://nathanlabadie.com/streaming-from-elastic-to-syslog-via-apache-nifi/
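Conceptually, what the SSL Context Service supplies to the Elasticsearch processors is the same thing a client-side TLS context does in any language. A minimal Python stdlib sketch of that idea — the CA bundle path is a placeholder, not a real file:

```python
import ssl

def make_context(ca_bundle=None):
    """Build a client TLS context that trusts a given CA bundle —
    conceptually what NiFi's SSL Context Service provides to a
    processor. ca_bundle is a placeholder path when you have a
    private CA; with None, the system's default CAs are used."""
    ctx = ssl.create_default_context(cafile=ca_bundle)
    ctx.check_hostname = True            # verify the server name in the cert
    ctx.verify_mode = ssl.CERT_REQUIRED  # reject untrusted certificates
    return ctx

ctx = make_context()  # e.g. make_context(ca_bundle="/path/to/ca.pem")
```

If the handshake fails here too, the certificate chain itself is the problem, not the NiFi configuration.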
06-07-2023
07:54 AM
1 Kudo
@SandyClouds, The answer is in any case not an easy one, as it mostly depends on what you are planning to do, how, how often and so on 🙂 First things first, you need to know that a cluster of 5 strong machines is much better than a cluster of 20 small machines: NiFi is recommended to be scaled vertically, not horizontally. Now, regarding your questions: if your single node has enough resources to sustain all those workflows + 10% (as failover, if necessary), then you do not need a cluster to perform your tasks. There are pros and cons for a standalone instance as well as for a cluster. To name a few:

Single node:
PROs:
- Easy to manage.
- Easy to configure.
- No HTTPS required.
CONs:
- In case of issues with the node, your NiFi instance is down.
- It uses plenty of resources when it needs to process data, as everything is done on a single node.

Cluster:
PROs:
- Redundancy and failover --> when a node goes down, the others will take over and process everything, meaning that you will not be affected.
- The used resources will be split among all the nodes, meaning that you can cover more use cases than on a single node.
CONs:
- Complex setup, as it requires ZooKeeper plus plenty of other config files.
- Complex to manage --> analysis will be done on X nodes instead of a single node.

Regarding the Docker question, that is up to you. I am not really a big fan of Docker, so my personal opinion is that you should use a separate physical server with SSD and good CPU and RAM, especially when you want to process an analytical workload (billions of actions per hour/day). So, as a conclusion, both standalone and cluster are good options for NiFi, but you will have to choose based on your project requirements and your project schedule (for example, if new flows will come, you will need to increase the resources, and so on).
06-02-2023
03:45 PM
@Dracile, I do not think that SplitJson is the correct processor for you. What you are trying to achieve might be possible using some JOLT transformations. Unfortunately, I am not near a computer to test a correct transformation, but I know that @SAMSAL has plenty of experience in using JOLT and he might be able to further assist you.
05-31-2023
03:12 AM
@Fredi, I am not an expert when it comes to using ExecuteScript in NiFi, as I mostly go for ExecuteStreamCommand, but I really recommend you to have a look at the following two links (until somebody with far more experience provides you with an answer), as they explain everything you need to know about executing a Jython script in NiFi: See Part 1 for FlowFile creation: https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-1/ta-p/248922 See Part 2 for I/O on FlowFiles: https://community.cloudera.com/t5/Community-Articles/ExecuteScript-Cookbook-part-2/ta-p/249018 Judging by your code and the examples provided by @mburgess, you are missing some important steps, and that might be the cause of your error.
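As a rough illustration of the pattern those cookbook articles describe — read the incoming FlowFile content, transform it, write it back — here is just the transformation core in plain Python. The uppercase step is a placeholder for whatever your script actually does; in Jython inside NiFi this function body would live inside the StreamCallback shown in the cookbook:

```python
def transform(content: bytes) -> bytes:
    """Stand-in for the body of an ExecuteScript StreamCallback:
    decode the FlowFile content, change it, re-encode it."""
    text = content.decode("utf-8")
    return text.upper().encode("utf-8")  # placeholder transformation

# Inside NiFi, a StreamCallback's process(inputStream, outputStream)
# would read the input stream, call something like transform(), and
# write the result to the output stream before session.transfer().
print(transform(b"hello flowfile"))
```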
05-31-2023
03:01 AM
Well, I am no expert in migrating from one version to another, so my answer might not be good enough for you :(. Besides that, I had no time to read the release notes for 1.21.0 and I am not quite sure whether anything changed in terms of config files. Assuming that you will keep the hostname and the port for each NiFi node and that you are using the embedded ZooKeeper, you should:
1. Stop the current NiFi instance.
2. Copy the authorizations.xml, authorizers.xml, bootstrap.conf, flow.json.gz, flow.xml.gz, logback.xml, login-identity-providers.xml, nifi.properties, stateless.properties, state-management.xml, users.xml and zookeeper.properties into the conf folder of your new NiFi instance.
3. Make sure that in nifi.properties, on your new instance, you are pointing to the same content repository, database repository (and all the other repositories) as in the previous instance --> assuming that you followed best practices and had all those repositories moved to separate disks. Otherwise, make sure that you process all your data on the old instance and start fresh on the new one.
4. Start the new NiFi instance.
If you are only interested in migrating the flows from the canvas, you can add all your flows into a template and save it on your local machine. Afterwards, you can open your new NiFi instance, upload your template and add it to your canvas.
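Step 2 above can be sketched as a small shell loop — the OLD/NEW paths are placeholders for your actual install locations, and the `[ -f ... ]` guard skips any file your old instance does not have (e.g. stateless.properties on some versions):

```shell
# Copy the known config files from the old conf folder into the new one.
# OLD and NEW are placeholder paths for your actual NiFi install locations.
OLD=./nifi-old/conf
NEW=./nifi-new/conf
mkdir -p "$NEW"
for f in authorizations.xml authorizers.xml bootstrap.conf flow.json.gz \
         flow.xml.gz logback.xml login-identity-providers.xml nifi.properties \
         stateless.properties state-management.xml users.xml zookeeper.properties; do
  # copy only the files that actually exist in the old instance
  [ -f "$OLD/$f" ] && cp "$OLD/$f" "$NEW/$f"
done
ls "$NEW"
```

Stop the old instance first so flow.json.gz / flow.xml.gz are not copied mid-write.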
05-30-2023
07:05 AM
1 Kudo
@ranaIrfan, I am not quite sure which version you are using, but at least since 1.19 you have the ListGoogleDrive and FetchGoogleDrive processors, which should be used to extract data out of your Google Drive folder. FetchGoogleDrive --> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-gcp-nar/1.19.0/org.apache.nifi.processors.gcp.drive.FetchGoogleDrive/index.html ListGoogleDrive --> https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-gcp-nar/1.19.0/org.apache.nifi.processors.gcp.drive.ListGoogleDrive/
05-25-2023
03:05 AM
As much as I would like to help, the provided error does not really help 😞 More than certain there is much more in your logs, either on your NiFi instance (check all the logs) or on your Registry. You should try to remember everything that has been performed since the last time you know it worked and try to revert those actions. Otherwise, based on the provided error message alone, you are out of luck 😞
05-25-2023
12:59 AM
everybody is entitled to an opinion, but may I ask why you are saying this? 🙂 As you are on a NiFi post, I assume that you are referring to the Cloudera NiFi documentation? I find it very helpful, especially combined with NiFi's original documentation. It even has some additional things compared to the original documentation. No matter the feedback, positive or negative, it is good when you know what to do with it. In your case, if you provided better and more structured feedback, maybe somebody from Cloudera would understand your point of view and could modify the documentation 🙂
05-23-2023
11:35 PM
@Adhitya, I am not quite sure what you mean by response topic, but as far as I can tell, you need to use ConsumeMQTT in order to "ingest" data out of the MQTT brokers/topics. If you want to publish to the brokers/topics, you will have to use PublishMQTT. In both cases, you will have to choose the MQTT Specification Version, selecting one of the following options: v3 Auto, v3.1.1, v3.1.0 and v5.0. See: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-mqtt-nar/1.21.0/org.apache.nifi.processors.mqtt.ConsumeMQTT/