Member since: 07-30-2019
Posts: 3219
Kudos Received: 1589
Solutions: 935
My Accepted Solutions

Views | Posted
---|---
91 | 03-17-2025 12:09 PM
133 | 03-11-2025 05:58 AM
264 | 03-06-2025 06:05 AM
212 | 03-04-2025 06:28 AM
244 | 03-03-2025 10:59 AM
03-04-2025
06:28 AM
1 Kudo
@AllIsWell Welcome to the community. NiFi templates were deprecated long ago within the Apache NiFi 1.x release line and were completely removed from the product in Apache NiFi 2.x. Flow Definitions, which are in JSON format, can be downloaded from and uploaded to both newer versions of Apache NiFi 1.x and all versions of Apache NiFi 2.x. The rest-api docs cover upload of a flow definition here: https://nifi.apache.org/nifi-docs/rest-api.html#uploadProcessGroup There are 6 form fields. An example rest-api call would look something like this:

curl 'https://<nifi-hostname>:<nifi-port>/nifi-api/process-groups/<uuid of Process group in which flow definition will be uploaded>/process-groups/upload' \
-H 'accept: application/json, text/plain, */*' \
-H 'Authorization: Bearer <TOKEN>' \
--form 'clientId="<uuid of Process group in which flow json will be uploaded>"' \
--form 'disconnectedNodeAcknowledged="false"' \
--form 'file=@"/<path to>/<flow-defintion.json filename>"' \
--form 'groupName="handleHTTPRequestFlow2c"' \
--form 'positionX="361"' \
--form 'positionY="229.5"' \
--insecure

I use LDAP to authenticate in to my NiFi, so I use the bearer token issued for my user on authentication in the above rest-api call (a sketch of how such a token can be requested follows below).

Please help our community grow and thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
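If you need to script the token retrieval as well, here is a minimal sketch against NiFi's /access/token endpoint, assuming an LDAP (username/password) login provider; hostname and credentials are placeholders, and the response body is the token string itself:

# Request a bearer token from a NiFi secured with a username/password login provider
curl -X POST 'https://<nifi-hostname>:<nifi-port>/nifi-api/access/token' \
  -H 'content-type: application/x-www-form-urlencoded' \
  --data-urlencode 'username=<ldap-username>' \
  --data-urlencode 'password=<ldap-password>' \
  --insecure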
03-03-2025
11:58 AM
@ajignacio The PutMarkLogic processor is not a component bundled and shipped with Apache NiFi, so I am not familiar with it. You may want to raise your issue directly with those who developed this connector and include your Apache NiFi version specifics as well: https://github.com/marklogic/nifi/issues Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
03-03-2025
10:59 AM
@Emery Your first query: "One thing I've noticed is that it's not possible to select the MapCacheClientService I created."

Components added to the NiFi canvas only have access to the controller services created on Process Groups (PG) on the canvas. Even when you are looking at the canvas presented when you first login, you are looking at the root PG. From the root PG you can create many child PGs.

When you use the Global Menu in the UI to access the "Controller Settings", you have the ability to create "Management Controller Services". Controller services created there are for use by Reporting Tasks and Registry Clients created from within that same "Controller Settings" UI. They are not going to be directly referenced by the components on the dataflow canvas. This is why the MapCacheClientService you created was not seen by your PutMapCache and FetchMapCache processors.

From within the processor component, you have the option to select an existing supported controller service that already exists at the current PG level or at any parent PG level (assuming the user has proper permissions at those parent levels). It is important to understand that child PGs inherit policies from parent PGs unless an explicit policy is defined on the child PG. You also have the option to "Create New Service", which you can select even if an available controller service already exists. If a supported controller service exists, it will be presented in a selectable list when you click on the processor field, so it is NOT necessary to create a separate controller service for each processor. To create a new service you must click on the three stacked dots instead of clicking on the field. A rest-api sketch for listing the controller services visible from a given PG follows below.

Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
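As a sketch, the controller services visible to a given PG, including those inherited from parent PGs, can be listed via the rest-api like this (hostname, uuid, and token are placeholders):

# List controller services available to a Process Group (includes services inherited from ancestor PGs)
curl 'https://<nifi-hostname>:<nifi-port>/nifi-api/flow/process-groups/<pg-uuid>/controller-services' \
  -H 'Authorization: Bearer <TOKEN>' \
  --insecure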
03-03-2025
09:26 AM
@jirungaray The DistributedMapCacheServer controller service sets up a cache server which keeps all cached objects in NiFi's JVM heap memory. This cache is lost if the controller service is disabled/re-enabled or if NiFi restarts, unless the "Persistence Directory" is configured. The Persistence Directory is a local disk directory where cache entries are persisted in addition to being held in heap memory. The persistence to disk allows the in-memory cache to be reloaded if the cache server is disabled/re-enabled or NiFi is restarted. I assume this is the cache server you are currently using. Matt
03-03-2025
06:04 AM
@Bern Unfortunately there is not enough information here to understand exactly what is going on. The only exception shared was related to an attempt to terminate a thread on some processor. As far as why you see this, there is not enough information to say. It could be a bug in an older version, a load issue, thread pool exhaustion, etc.

Observations and questions:
- You are running a very old version of Apache NiFi, released 6+ years ago and one of the first releases to offer the Load-Balanced connections feature, which was very buggy when first introduced. You would greatly benefit from upgrading for security and bug fix reasons.
- You seem to be using load-balanced connections excessively. It makes sense to redistribute NiFi FlowFiles in connections after your ExecuteSQL processors, but I see no value in redistributing after RouteOnAttribute or on the failure connections. This just adds excessive and unnecessary network traffic load.
- I see you have ~1400 running components and a queue of ~265,000 FlowFiles. What is the CPU load average on each of your nodes, and how many nodes do you have in your NiFi cluster?
- What Java version is being used?
- Are Garbage Collection (GC) stats healthy? How often is GC (partial and full) running? How long is spent on GC?
- Any other ERROR in your nifi-app.log?
- Have you taken any thread dumps when you are having issues with processor component threads? What did you observe there? (Example commands for collecting a thread dump and basic GC stats are sketched below.)

Please help our community grow and thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
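A minimal sketch for collecting that diagnostic data on a NiFi node (the pid and file names are placeholders):

# Write a thread dump of the NiFi JVM to a file (run from the NiFi install directory)
./bin/nifi.sh dump thread-dump.txt

# Sample GC utilization of the NiFi JVM every 5 seconds (requires JDK tools on the node)
jstat -gcutil <nifi-jvm-pid> 5000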
02-28-2025
10:21 AM
1 Kudo
@shiva239 The schemas are fetched when a FlowFile is processed by the PutDatabaseRecord processor. There is no option to schedule a refresh of the existing cache. There is no rest-api endpoint or option available to flush the cache on demand, aside from maybe stopping and starting the processor (a rest-api sketch of that work-around follows below). But doing so means every schema will be cached again as new FlowFiles are processed by the PutDatabaseRecord processor, so it is not an ideal solution/work-around. The issue you are having is related to an existing Apache NiFi improvement jira, NIFI-12027, for PutDatabaseRecord. I suggest you add a comment to this jira explaining your use case and the impact this has. Perhaps someone in the community, or you yourself, can contribute to this improvement.

Please help our community grow and thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
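A sketch of the stop/start work-around via the rest-api (hostname, uuid, token, and revision version are placeholders; the revision version must match the component's current revision):

# Stop the processor; its in-memory table schema cache is rebuilt after the component is started again
curl -X PUT 'https://<nifi-hostname>:<nifi-port>/nifi-api/processors/<processor-uuid>/run-status' \
  -H 'content-type: application/json' \
  -H 'Authorization: Bearer <TOKEN>' \
  -d '{"revision":{"version":<current-version>},"state":"STOPPED"}' \
  --insecure

# Then start it again the same way with "state":"RUNNING" (and the updated revision version)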
02-28-2025
06:16 AM
1 Kudo
@shiva239 The PutDatabaseRecord processor has a Table Schema Cache Size property that specifies how many table schemas should be cached. This cache is used to improve performance. You could try setting this to 0 from its default of 100, but I am not sure how this will impact your specific overall performance.

Please help our community grow and thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
02-28-2025
05:57 AM
@0tto Welcome to the community. The NiFi backend does not provide the ability to configure the Provenance Repository to store provenance events in an external DB which can then be accessed via the Provenance UI integration. However, there are a couple of provenance Reporting Tasks available within NiFi that can be used to additionally send provenance events (the local provenance repository still exists) to another destination:
- AzureLogAnalyticsProvenanceReportingTask
- SiteToSiteProvenanceReportingTask

For sending the provenance events to a DB, building a dataflow on a dedicated NiFi instance via the SiteToSiteProvenanceReportingTask is going to be the option for you. You would add this reporting task to the NiFi instance/cluster generating the provenance events you want to keep for long term storage. You would set up another NiFi instance/cluster for processing the large volume of provenance events. The Reporting Task would be configured to send the provenance events to a Remote Input Port on the other NiFi via NiFi's Site-To-Site capability (a sketch of the Site-To-Site properties on the receiving NiFi follows below). Admin guide: Site to Site Properties. User guide: Site-to-Site.

Once these provenance events are received by that other NiFi, they become the content of FlowFiles which you can route via a NiFi dataflow on the canvas and send to whatever storage destination you choose.

Note: You typically do not want to use the same NiFi where you are using the reporting task to receive and process the provenance events, because those received events will also produce provenance events as they are routed through the dataflow, so you would have endless provenance events being produced. Sending to a dedicated provenance NiFi instance/cluster makes sure that your DB contains only the dataflow(s) provenance events of interest.

Please help our community grow and thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
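A minimal sketch of the nifi.properties settings that enable Site-To-Site on the receiving NiFi (hostname and port values are placeholders; see the Admin guide section referenced above for details):

# nifi.properties on the receiving (dedicated provenance) NiFi
nifi.remote.input.host=<receiving-nifi-hostname>
nifi.remote.input.secure=true
nifi.remote.input.socket.port=<raw-socket-port>
nifi.remote.input.http.enabled=true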
02-27-2025
11:14 AM
@AlokVenugopal Welcome to the community. What you are encountering is an authorization issue and not an authentication issue. NiFi is accepting your token issued through your application login, but authorization does not exist for the user identity derived from your token. In NiFi, after successful authentication, the user identity is passed to the NiFi authorizer to determine which NiFi policies have been authorized for that user identity. When using your application's token, this results in no authorization found, because neither the user identity nor any known groups that user identity belongs to are authorized for the required policy.

identity[kLM-4Eld2dZnX_dD3iB0df2fTvXQxa1J2ffdLoK-ozas], groups[]

Supporting the user "unique id" would require that NiFi's authorizer contained that unique id and that it was authorized for the necessary NiFi policies. Authorizing users based on these unique ids does not make much sense in NiFi, as it would be error prone and difficult to manage authorization. An admin would need to know which user these unique IDs map to in order to set up authorization successfully. The first option would be modifying your app so that the returned token contains an ID that matches the user identity, similar to what NiFi does.

Assuming this "unique id" does not change and is always the same for the specific user, perhaps you can work around this creatively within NiFi through group based authorization. This would require using the file-user-group-provider within the NiFi authorizers.xml (a sketch of that provider follows below), which allows you to manually add user identities and group identities. So you create a new group such as "username" via the NiFi UI. You then add your existing user (the one that successfully gets authorized when you authenticate through NiFi) to this new group. You then add a new user identity for that "unique id" and make that new user a member of that same group via the NiFi UI. Now authorize the group for whichever policies are necessary. From then on, whether your user authenticates via NiFi to get a token or through your app to get a token, the user will successfully be authorized via the shared group membership.

Please help our community grow and thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
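A sketch of the file-user-group-provider entry in authorizers.xml (the identifier, class, and property names are the standard ones from the NiFi Admin guide; the file path shown is the common default, so adjust to your installation):

<userGroupProvider>
    <identifier>file-user-group-provider</identifier>
    <class>org.apache.nifi.authorization.FileUserGroupProvider</class>
    <property name="Users File">./conf/users.xml</property>
    <property name="Legacy Authorized Users File"></property>
    <property name="Initial User Identity 1"><!-- optional: an initial user identity --></property>
</userGroupProvider>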
02-26-2025
12:48 PM
2 Kudos
@jirungaray Welcome to the community. The DetectDuplicate processor does not store anything in NiFi state providers (local state directory or cluster state in zookeeper). The DetectDuplicate processor utilizes a DistributedMapCache Service to store cached items. Depending on the cache service used, those cache services may offer retention configurations for number of cache entries and cache entry persistence. Any NiFi component that retains state will indicate such in its documentation under the "State Management" section.

The "Age Off Duration" configuration will age off cache entries that may still exist when that duration is reached, but it cannot control the number of cache entries the end service will retain. So the cache service may still be evicting cache entries before that configured Age Off Duration is reached.

Since you mention that your cache entries are not being preserved on NiFi restart, I assume you have configured your DetectDuplicate to use the DistributedMapCacheClientService. The DistributedMapCacheClientService is dependent on the existence of a running DistributedMapCacheServer. This DistributedMapCacheServer does in fact hold cache entries within NiFi's heap memory and, unless you have configured a "Persistence Directory", will lose all cache entries on NiFi service stop. The DistributedMapCacheServer also has configuration thresholds for the max number of cache entries it will hold before evicting cache entries based on the eviction strategy configured. This configuration establishes an upper boundary. Keep in mind that the higher the Max Cache Entries setting, the more NiFi heap memory is used, which could lead to NiFi experiencing OutOfMemory (OOM) exceptions.

Since it sounds like you want to retain a very large number of cached entries, I'd recommend against using the NiFi internal DistributedMapCacheClientService, considering the high heap memory usage it would require and the high likelihood that it will impact your NiFi's stability and performance.

NOTE: The DistributedMapCacheClientService and DistributedMapCacheServer do NOT offer any form of High Availability. The DistributedMapCacheClientService can only be configured with a single server hostname. While the DistributedMapCacheServer, when started, does create a running cache server on all hosts within the NiFi cluster, the cached entries are not shared or replicated across them. ONLY the cache server hostname configured in the DistributedMapCacheClientService is used. For HA, you should use a more robust cache service external to NiFi, like Redis.

Please help our community grow and thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt