Member since: 02-01-2022
Posts: 274
Kudos Received: 97
Solutions: 60
My Accepted Solutions
Views | Posted |
---|---|
438 | 05-15-2025 05:45 AM |
3470 | 06-12-2024 06:43 AM |
6102 | 04-12-2024 06:05 AM |
4174 | 12-07-2023 04:50 AM |
2241 | 12-05-2023 06:22 AM |
06-21-2023
06:40 AM
@DTM In that case you would need to use the DBCPConnectionPool controller service and JDBC to connect to AWS Postgres. This requires network permissions that allow the NiFi hosts to reach the RDS endpoint. If you can't use DBCP, you will have to put something between RDS and NiFi. For example, NiFi could use InvokeHTTP to POST data to an EC2 instance running some kind of API that handles the database connectivity.
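For the JDBC route, the controller service mainly needs a driver class and a connection URL. Here is a minimal Python sketch that assembles those values. The host, database name, and user are hypothetical placeholders, and the property labels are as I recall them from the DBCPConnectionPool configuration dialog, so check them against your NiFi version.

```python
def dbcp_settings(host, database, user, port=5432):
    """Return property values for a DBCPConnectionPool pointing at Postgres."""
    return {
        # Standard PostgreSQL JDBC URL form: jdbc:postgresql://host:port/database
        "Database Connection URL": f"jdbc:postgresql://{host}:{port}/{database}",
        # Driver class shipped in the PostgreSQL JDBC jar
        "Database Driver Class Name": "org.postgresql.Driver",
        "Database User": user,
    }


# Hypothetical RDS endpoint, database, and user, for illustration only.
settings = dbcp_settings(
    "mydb.example.us-east-1.rds.amazonaws.com", "appdb", "nifi_user"
)
for name, value in settings.items():
    print(f"{name}: {value}")
```

The network side still has to be in place: the RDS endpoint must accept inbound connections on the Postgres port from the NiFi hosts.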
06-21-2023
06:27 AM
@DTM You are correct: to use AWS credentials in NiFi, you need to use the AWSCredentialsProviderControllerService controller service. Processors reference it through the AWS Credentials Provider Service drop-down menu. If one does not exist in the drop-down, you can choose to create it. https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-aws-nar/1.21.0/org.apache.nifi.processors.aws.credentials.provider.service.AWSCredentialsProviderControllerService/index.html An example of one such processor is GetDynamoDB, but there are many. Just search for AWS in the processor search box to find all AWS-related processors.
06-21-2023
06:21 AM
2 Kudos
@drewski7 Your actual bottleneck with the initial cluster size is the limited number of cores and amount of RAM. No matter what you do with concurrency, run schedule, processor config, or other processors, throughput is capped by the total cores and JVM memory on 2 machines. The total number of nodes is itself a limitation too. Ideally you want a master node at the top of the flow pushing flowfiles down to 2, 3, 4, 5, or more nodes to distribute the workload. That division of the workload is where NiFi shines and where you see massive throughput.
06-20-2023
06:47 AM
1 Kudo
@wert_1311 Check out this article: https://community.cloudera.com/t5/Community-Articles/Support-Video-Enabling-kubectl-for-CDE/ta-p/314200 Enjoy Breakstone's amazing radio voice! 😉
06-20-2023
06:42 AM
3 Kudos
@drewski7 This blog is a great place to start: https://blog.cloudera.com/benchmarking-nifi-performance-and-scalability/
That said, some recommendations:
- Use 3 nodes.
- Use 32 or 64 GB of RAM per node. Set min RAM to 16 GB and max to 32 GB for NiFi, and let NiFi and the operating system leverage the other 32 GB.
- Add more cores and tune Active Thread Count accordingly.
- Be careful about which processors are Primary Only and which are not.
- Do not over-load-balance queues. Load balance at the top of the flow and let NiFi distribute the workload naturally after that.
- Tune processor Concurrency and Run Schedule. Be sure to understand how each works.
- With a good setup tuned as above, have a plan for identifying when it is time to scale horizontally (add more nodes).
Here are some more docs that get specific about sizing: https://docs.cloudera.com/cfm/2.1.1/nifi-sizing/topics/cfm-sizing-recommendations.html https://docs.cloudera.com/HDPDocuments/HDF3/HDF-3.5.2/nifi-configuration-best-practices/content/configuration-best-practices.html
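To make the memory recommendation concrete, here is a small Python sketch that renders the heap lines for NiFi's bootstrap.conf. It assumes the min/max RAM above refers to the JVM heap and that your bootstrap.conf uses the stock java.arg.2 and java.arg.3 entries for -Xms and -Xmx; confirm both against your own file before editing it.

```python
def heap_args(min_gb, max_gb):
    """Render the two JVM heap lines as they would appear in bootstrap.conf."""
    if min_gb > max_gb:
        raise ValueError("minimum heap cannot exceed maximum heap")
    return [
        f"java.arg.2=-Xms{min_gb}g",  # initial (minimum) heap size
        f"java.arg.3=-Xmx{max_gb}g",  # maximum heap size
    ]


# 16 GB minimum and 32 GB maximum on a 64 GB node, as suggested above.
for line in heap_args(16, 32):
    print(line)
```

On a 64 GB node this leaves roughly half of the memory for the operating system and anything NiFi uses outside the heap.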
06-20-2023
06:31 AM
@Phil_I_AM You should be able to use InvokeHTTP to build any REST API call. https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-standard-nar/1.17.0/org.apache.nifi.processors.standard.InvokeHTTP/index.html The approach I recommend is to start from a fully working Postman call with a known URL, GET/POST parameters, and the required authentication headers. With that working call and its details in hand, duplicate the setup in InvokeHTTP until it is operational.
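It can help to write down the working call as explicit pieces (method, URL, headers, body) before copying them into the processor. Here is a minimal Python sketch using only the standard library. The URL, token, and payload are hypothetical, and nothing is sent over the network; the request is only assembled and printed.

```python
import json
import urllib.request


def build_request(url, method="GET", headers=None, payload=None):
    """Assemble an HTTP request from the same pieces a Postman call has."""
    body = json.dumps(payload).encode("utf-8") if payload is not None else None
    req = urllib.request.Request(url, data=body, method=method)
    for name, value in (headers or {}).items():
        req.add_header(name, value)
    if body is not None:
        req.add_header("Content-Type", "application/json")
    return req


# Hypothetical endpoint and credentials, for illustration only.
req = build_request(
    "https://api.example.com/v1/items",
    method="POST",
    headers={"Authorization": "Bearer <token>"},
    payload={"name": "test"},
)
print(req.get_method(), req.full_url)
print(dict(req.header_items()))
print(req.data)
```

Each piece then has a home in InvokeHTTP: the method and URL go into the corresponding processor properties, the body comes from the incoming flowfile content, and, as far as I recall, extra request headers are added as dynamic properties. Check the linked documentation for the exact property names in your version.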
06-20-2023
06:23 AM
2 Kudos
@JoseRoque HDP downloads are behind a paywall. Additionally, HDP is no longer supported, so I highly recommend that you check out CDP. You can still find an HDP Sandbox in Docker: https://www.cloudera.com/tutorials/sandbox-deployment-and-install-guide/3.html Pay attention to the notice at the top of that page: "As of January 31, 2021, this tutorial references legacy products that no longer represent Cloudera’s current product offerings. Please visit recommended tutorials: How to Create a CDP Private Cloud Base Development Cluster; All Cloudera Data Platform (CDP) related tutorials."
06-16-2023
05:39 AM
@nuxeo-nifi I first want to make some suggestions that will help us respond better:
- Include a screenshot of your entire flow.
- Include as much detail as possible about how certain parts are completed. For example: how is the CSV processed?
- Indicate what you have tried, or what you see "toward the end of processing", including details of what you expect. For example: a single update statement with fail and success counts, or inserting failures into one table and errors into another.
Without this, we have to make assumptions that could result in an inaccurate solution or turn the post into a long, drawn-out dialogue instead of a simple question with a direct answer.
Making those assumptions, I will assume that at the bottom of your flow you have a success and a failure relationship. One suggestion would be to use MergeRecord or MergeContent to obtain the counts, then perhaps ReplaceText to shape the counts into the correct flowfile content, and route that to an ExecuteSQL processor to execute your SQL statements.
An alternative would be to send errors and successes to separate ExecuteSQL processors, so that each flowfile executes a SQL statement that increments the existing count. This avoids the need to merge and compute totals. Something like these, one in each ExecuteSQL:
UPDATE table SET success = success + 1 WHERE tablename = 'something'
UPDATE table SET errors = errors + 1 WHERE tablename = 'something'
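To show how the increment-per-flowfile approach adds up, here is a small Python sketch that runs the same two UPDATE statements against an in-memory SQLite database. The table name load_counts and the sequence of outcomes are hypothetical stand-ins; in NiFi each statement would be issued by the processor on its own relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical counter table: one row per target table being loaded.
conn.execute(
    "CREATE TABLE load_counts ("
    "tablename TEXT PRIMARY KEY, "
    "success INTEGER NOT NULL DEFAULT 0, "
    "errors INTEGER NOT NULL DEFAULT 0)"
)
conn.execute("INSERT INTO load_counts (tablename) VALUES (?)", ("something",))


def record_outcome(conn, tablename, ok):
    """Increment the success or errors counter by one for a single flowfile."""
    # The column name comes from this fixed pair, never from user input.
    column = "success" if ok else "errors"
    conn.execute(
        f"UPDATE load_counts SET {column} = {column} + 1 WHERE tablename = ?",
        (tablename,),
    )


# Simulate five flowfiles: three routed to success, two routed to failure.
for ok in [True, True, False, True, False]:
    record_outcome(conn, "something", ok)
conn.commit()

counts = conn.execute(
    "SELECT success, errors FROM load_counts WHERE tablename = ?",
    ("something",),
).fetchone()
print(counts)  # (3, 2)
```

Because each statement is a relative increment rather than an absolute write, the order in which flowfiles arrive does not matter, which is what removes the need for the merge step.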
06-15-2023
09:36 AM
@Ray82 Yes, you can achieve this with UpdateRecord. You will need to provide a record reader and writer with the schemas of your upstream and downstream data. Then, in UpdateRecord, you explicitly add a property (+) for each record value you want to update, rather than writing a SQL statement as you would in QueryRecord. Here are some useful community posts on this topic: https://community.cloudera.com/t5/Community-Articles/Update-the-Contents-of-FlowFile-by-using-UpdateRecord/ta-p/248267 https://community.cloudera.com/t5/Support-Questions/NiFi-UpdateRecord-processor-is-not-updating-JSON-path/m-p/186256
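As a rough mental model of the property-per-field idea, here is a toy Python emulation. This is not NiFi code: it only handles top-level paths such as /status with literal values, and the field names and values are made up for illustration.

```python
def update_record(record, properties):
    """Apply {"/field": value} style properties to a copy of one record."""
    updated = dict(record)
    for path, value in properties.items():
        # This toy version supports only top-level paths like "/status".
        if not path.startswith("/") or "/" in path[1:]:
            raise ValueError(f"only top-level paths are supported: {path}")
        updated[path[1:]] = value
    return updated


# Hypothetical records and properties, for illustration only.
records = [{"id": 1, "status": "new"}, {"id": 2, "status": "new"}]
properties = {"/status": "processed", "/source": "nifi"}

result = [update_record(r, properties) for r in records]
print(result)
```

In the real processor the property name is a RecordPath and the value can be a literal or another record path, depending on the replacement strategy. A field that does not exist upstream, like source here, also has to be present in the writer's schema to appear in the output.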
06-15-2023
08:54 AM
@MOUROU I recently built a NiFi flow in version 1.21 that uses the NiFi API from within NiFi, and it is NOT necessary to get an access token. From within NiFi I was able to just start using the API calls I needed. It would be worth checking whether 1.16 behaves the same way. That flow is here: https://github.com/cldr-steven-matison/NiFi-Templates/blob/main/NiFi_Template_XML_to_Flow_Definition_JSON.json