About SowmyaP

cotopaul · ‎12-07-2023

@SowmyaP: I encountered similar issues, and they were always related to heap memory and CPU consumption. To solve the problem, I stopped the entire NiFi cluster and started it again, making sure that all processors came up in the STOPPED state (see the nifi.properties file in /conf for the property). Once your cluster is back up, make sure that you delete the file from the queue to avoid future problems.
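The post does not name the property. A likely candidate is `nifi.flowcontroller.autoResumeState` in `conf/nifi.properties`, which controls whether components resume their previous run state when NiFi starts. As a minimal sketch, assuming that is the property meant, the following Python helper checks an excerpt of the file before a restart. The function name and the sample text are hypothetical, and treating an unset property as `true` is an assumption.

```python
def auto_resume_enabled(properties_text: str) -> bool:
    """Return True unless nifi.flowcontroller.autoResumeState is set to false.

    Assumption: an unset property behaves like 'true'.
    """
    value = "true"
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        key, sep, val = line.partition("=")
        if sep and key.strip() == "nifi.flowcontroller.autoResumeState":
            value = val.strip().lower()
    return value != "false"


# Hypothetical excerpt of conf/nifi.properties.
SAMPLE = """
# flow controller settings
nifi.flowcontroller.autoResumeState=false
"""

if auto_resume_enabled(SAMPLE):
    print("Processors will resume their previous state on startup.")
else:
    print("Processors will come up stopped.")
```

In practice the text would be read from the real file on each node, since every node in the cluster has its own copy of the configuration.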

SowmyaP · ‎11-22-2023

Dears, I have a 3-node NiFi cluster. In my flow I insert data from an Excel sheet into a SingleStore database by converting it to CSV and then using the PutDatabaseRecord processor for the insertion. Inserting 6 lakh (600,000) records takes 1 hour. Round-robin load balancing is used, and each node has 4 CPU cores. How can I reduce the time taken to insert the data into the table? Thanks!

VidyaSargur · ‎11-20-2023

@SowmyaP, Has the reply helped resolve your issue? If so, please mark the appropriate reply as the solution, as it will make it easier for others to find the answer in the future.

MattWho · ‎07-26-2023

@SowmyaP I am not clear on what you are trying to accomplish with a specific order of restarts, starts and stops. Can you provide some details about the use case you are trying to implement and why?

In a NiFi cluster, the "Primary" node and "Cluster Coordinator" node roles are elected by Zookeeper. Other nodes in the cluster have no specific assigned role. Which node holds these roles can change at any time, so I am not clear on the importance of a specific restart order. If you have designed dataflows around a requirement that a specific node is always the elected "Primary" node, that can't be guaranteed.

In your first "Restart" scenario, assuming that by "secondary nodes" you mean any node that is not elected to the "Primary" role, each secondary node will restart and rejoin the cluster. If one of those secondary nodes held the "Cluster Coordinator" role, that role will switch to another node still running in the cluster. The node holding the "Primary" role should retain it; however, when you then restart the "Primary" node, the "Primary" role will move to one of the other nodes in the cluster. When the previously elected "Primary" node rejoins the cluster, it will NOT reassume the "Primary" role.

In your second "Start" scenario, assuming all nodes are currently stopped, you can start just the node you want to hold the "Primary" role and wait for that node to start completely and be elected to both the "Primary" and "Cluster Coordinator" roles. If you access the cluster UI at this point, it will show 1 of 1 connected nodes. Then you can start your secondary nodes and they will join the already established cluster.

In the third "Stop" scenario, it really does not matter which node you stop first. The nodes get their roles assigned by Zookeeper.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
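The role behaviour described in this reply can be illustrated with a toy model in Python. This is only a sketch of the outcomes described above, not of how Zookeeper elections work: the class `ToyCluster` and the node names are hypothetical, and choosing the longest-running node as the new role holder is an assumption made so the example is deterministic. A real cluster gives no such guarantee.

```python
class ToyCluster:
    """Toy model: roles stay with their holder until that node stops."""

    def __init__(self):
        self.running = []        # running nodes, in the order they joined
        self.primary = None
        self.coordinator = None

    def _elect(self):
        # Re-elect only when the current holder is no longer running.
        # Assumption: the longest-running node wins; real elections may differ.
        if self.primary not in self.running:
            self.primary = self.running[0] if self.running else None
        if self.coordinator not in self.running:
            self.coordinator = self.running[0] if self.running else None

    def start(self, node):
        if node not in self.running:
            self.running.append(node)
        self._elect()

    def stop(self, node):
        if node in self.running:
            self.running.remove(node)
        self._elect()

    def restart(self, node):
        self.stop(node)
        self.start(node)


cluster = ToyCluster()

# "Start" scenario: the first node up takes both roles.
cluster.start("node1")
print(cluster.primary, cluster.coordinator)

# The other nodes join the already established cluster.
cluster.start("node2")
cluster.start("node3")

# "Restart" scenario: restarting the primary moves the role elsewhere,
# and the restarted node does not take it back when it rejoins.
cluster.restart("node1")
print(cluster.primary)
```

The point of the model is the last step: after its restart, node1 is running again but no longer holds the "Primary" role.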

MattWho · ‎05-26-2023

@SowmyaP There seem to be a lot of details missing from your use case description.

The GenerateFlowFile processor is used to create a FlowFile (it can be configured to produce a FlowFile with specific content and specific custom FlowFile attributes as well). So from your use case description, you are adding custom text in your GenerateFlowFile processor and want to validate later in your dataflow that the custom text is correct? I am a bit confused about the validation, since you are defining the format in GenerateFlowFile. Where do you expect that text to get changed in your dataflow(s), thus requiring you to validate it? I am not clear on your end-to-end use case.

Processors like the following can be used to validate FlowFiles:

1. RouteOnAttribute - can be used to evaluate a NiFi Expression Language (NEL) statement against the value of an attribute on a FlowFile and then route the FlowFile to a dynamic relationship. The dynamic failure relationship or the unmatched relationship, depending on your choice of implementation, could be passed to an UpdateAttribute processor where you generate the log exception text. Then route to a LogAttribute processor configured to produce an ERROR log line in nifi-app.log reporting your exception.
2. RouteOnContent - a similar strategy to RouteOnAttribute, except that here you create dynamic properties that use Java regular expressions, which are evaluated against the FlowFile's content instead of a FlowFile attribute, to route FlowFiles to a dynamic relationship. The rest of the dataflow design is the same as above.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
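The regex-based routing idea from this reply can be sketched outside NiFi in a few lines of Python. This is not NiFi code: the pattern, the relationship name "matched", and the log text are hypothetical, and Python's `re` module stands in for the Java regular expressions that RouteOnContent uses (simple patterns like this one behave the same in both).

```python
import re

# Hypothetical expectation: content is "id,name" with a numeric id.
EXPECTED = re.compile(r"\d+,[A-Za-z ]+")


def route(content: str) -> str:
    """Return the relationship a piece of content would be routed to."""
    # fullmatch requires the whole content to match the pattern.
    return "matched" if EXPECTED.fullmatch(content.strip()) else "unmatched"


def error_line(content: str) -> str:
    """Build the exception text that would be logged for unmatched content."""
    return f"ERROR Validation failed for content: {content!r}"


for sample in ["42,Alice", "forty-two,Alice"]:
    if route(sample) == "unmatched":
        print(error_line(sample))
```

In the flow described above, the "unmatched" branch corresponds to the path through UpdateAttribute and LogAttribute, while matched FlowFiles continue through the rest of the dataflow.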

SAMSAL · ‎04-12-2023

Hi, you can take a look at the ForkEnrichment and JoinEnrichment processors: https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.20.0/org.apache.nifi.processors.standard.JoinEnrichment/additionalDetails.html If that helps, please accept the solution. Thanks

Last Visited	‎01-10-2024 01:27 PM

Member Since	‎04-12-2023 07:33 AM
Posts	7

Cloudera Community

Re: Issue while restarting Nifi

Data Insertion taking long time from Nifi to Singl...

Re: Error while converting excel to csv

Re: Restart, start, stop nodes

Re: GenerateFlowFile Validations

Re: Joining a CSV file with database table