Member since: 07-30-2019
Posts: 2440
Kudos Received: 1284
Solutions: 689
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 211 | 01-17-2023 12:55 PM |
 | 68 | 01-12-2023 01:30 PM |
 | 122 | 01-12-2023 12:52 PM |
 | 108 | 12-20-2022 12:06 PM |
 | 327 | 12-16-2022 08:53 AM |
10-24-2022
09:13 AM
@D5ha It is often useful to share more about your environment, including the full NiFi version and Java version. Since it is reporting issues loading the flow:

java.lang.Exception: Unable to load flow due to: java.util.zip.ZipException: invalid stored block lengths
at org.apache.nifi.web.server.JettyServer.start

I would lean towards some issue/corruption with the flow.xml.gz and/or flow.json.gz on this node. Since all nodes run the exact same copy of these files, I'd copy them from a good node to the node failing to start. Depending on your NiFi version, you may not have a flow.json.gz file (this format was introduced in the most recent versions). If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
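A minimal sketch of that recovery, assuming a default conf directory layout and hypothetical host/path names (nifi-good, /opt/nifi):

# On the failing node, stop NiFi and set the suspect files aside
./bin/nifi.sh stop
mv conf/flow.xml.gz conf/flow.xml.gz.bak
mv conf/flow.json.gz conf/flow.json.gz.bak    # only if this file exists

# Copy the files from a known-good node, then restart
scp nifi-good:/opt/nifi/conf/flow.xml.gz /opt/nifi/conf/
scp nifi-good:/opt/nifi/conf/flow.json.gz /opt/nifi/conf/    # newer versions only
./bin/nifi.sh start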
10-24-2022
08:59 AM
@MrBurns You want to take the URL written to the FlowFile's "http.request.uri" attribute and generate JSON from it, correct? Where do you want to write that JSON (a new FlowFile attribute? The content of the FlowFile?)? There are multiple ways to handle this. If you just want to write JSON to a new FlowFile attribute, you could use the "Advanced" UI of UpdateAttribute, setting up a rule for each URL type. If you want to write to the content of the FlowFile, you could follow the above UpdateAttribute with a ReplaceText processor that does an "Always Replace" to write the JSON from the attribute into the content of the FlowFile. Another option is to use RouteOnAttribute to route each URL type to a dedicated ReplaceText that handles that specific URL type. I like the first option, since you can easily add new rules to UpdateAttribute if additional URL types are introduced, without needing to modify the rest of your dataflow. If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
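A sketch of that first option, assuming a hypothetical destination attribute named url.json and one illustrative rule for a single URL type:

UpdateAttribute (Advanced UI)
  Rule: login-url
    Condition: ${http.request.uri:contains('/login')}
    Action: url.json = {"type":"login","uri":"${http.request.uri}"}

ReplaceText (only if the JSON should become the FlowFile content)
  Replacement Strategy: Always Replace
  Evaluation Mode: Entire text
  Replacement Value: ${url.json}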
10-24-2022
07:55 AM
@PriyankaMondal I don't recommend using the NiFi embedded ZooKeeper (ZK). It makes things easy, but it is not an ideal solution for production. ZK requires a quorum of 3 nodes minimum, so with NiFi configured to use the embedded ZK, your NiFi cluster would need at least 3 nodes. Without a quorum, ZK cannot perform its required role. ZK is used to elect the cluster coordinator and primary node roles that a NiFi cluster requires. Also, when using embedded ZK, even with 3 NiFi nodes, ZK won't achieve quorum until all three nodes are up, and you'll see messages like the ones you shared until the ZK cluster has formed and quorum is established. Your cluster can also break (lose access to the UI) if you lose nodes (NiFi shutdown or dies), because you also end up losing the embedded ZK and thus quorum is lost. I suggest going to each of your 3 NiFi servers Svxxx.xyz.com (1), Svxxx.xyz.com (2) and Svxxx.xyz.com (3) to make sure that ZK started and is listening on port 2181. I am assuming these are really three different hosts with unique hostnames, and not that you tried to create 3 ZKs on one host. If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
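A quick way to check each host, assuming the default ZK client port of 2181 and that ZK's four-letter-word commands are enabled:

# Is anything listening on the ZK client port?
ss -ltn | grep 2181

# Ask the local ZK for its status (reports standalone/leader/follower once quorum forms)
echo srvr | nc localhost 2181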
10-24-2022
07:35 AM
@D5ha Your issue here is with the certificate being used to perform the clientAuth action. Your certificate would also not work if you had a multi-node cluster. It is only working as a single-node cluster because there are no other nodes with which your single node needs to communicate as a client. The keystore requirements for NiFi are as follows:
1. The keystore MUST contain ONLY one PrivateKeyEntry.
2. The PrivateKeyEntry MUST have both clientAuth and serverAuth ExtendedKeyUsage (EKU).
3. The PrivateKeyEntry MUST have a SubjectAlternativeName (SAN) entry that matches the NiFi node's server hostname. If you are also going to be addressing your server by its IP, you should have that IP as a SAN entry as well. Any other alternative hostname this server may be known by (meaning users type that alternate hostname in a URL to reach this host) should also be added as a SAN.
In your case, the current issue happens in the mutual TLS handshake. You have configured your SiteToSiteBulletinReportingTask to send to https://<some ip>/nifi. The same NiFi server receives that client hello and responds with a server hello, which includes the SAN entries. The client (the reporting task) looks at that server hello and rejects the handshake at that point, because it looks like a man-in-the-middle attack: the client tried to reach host <some ip>, but instead a host with SAN <localhost> responded. There is no configuration change you can make in your secured NiFi to get around this. You'll need to get a new certificate meeting the minimum criteria I outlined above. You'll also need to do this if you ever intend to add more hosts to your NiFi cluster. If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
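A sketch of generating a self-signed keystore that meets those requirements with keytool, using a hypothetical hostname nifi01.example.com and IP 10.0.0.5:

keytool -genkeypair -alias nifi01 -keyalg RSA -keysize 2048 \
  -dname "CN=nifi01.example.com" \
  -ext SAN=dns:nifi01.example.com,ip:10.0.0.5 \
  -ext EKU=serverAuth,clientAuth \
  -validity 730 -storetype PKCS12 -keystore keystore.p12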
10-21-2022
01:25 PM
@DGaboleiro I am a bit confused by your dataflow design. In a NiFi multi-node cluster, each node is only aware of, and can only execute upon, FlowFiles present on that one node. So in your dataflow you have the QueryCassandra processor executing on "primary node" only, as you should (having it execute on all nodes would result in both your nodes performing the same query and returning the same data). You then split that JSON and use a DistributeLoad processor, which appears to be a means to send half the FlowFiles to node 1 and the other half to node 2. This is not the best way to do this. You are running Apache NiFi 1.17, which means load balanced connections are available that accomplish the same thing without all these additional processors (see the sketch after this response). https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#settings After your FlowFiles (this is what is being moved from processor to processor on your canvas) have been distributed, I see that you use a MergeContent processor. The MergeContent processor can only merge the FlowFiles present on the same node. It will not merge FlowFiles from multiple nodes into a single FlowFile. So if your desire is to have one merge of all FlowFiles, distributing them across multiple nodes will not give you that desired outcome. You should never configure any processor that accepts an inbound connection for "primary node" only execution. This is important since which node is elected as primary node can change at any time. Execution strategy has nothing to do with the availability of FlowFiles on each node on which to execute. What is important to understand is that each node in your NiFi cluster has its own copy of the flow, its own set of content and FlowFile repositories containing unique data, and each node executes the processors in its flow with no regard for the existence of other nodes. A node simply learns from Zookeeper whether it has been elected as the cluster coordinator and/or primary node. If it is elected primary node, it will execute "primary node" and "all nodes" components. If it is not the primary node, it will only execute the "all nodes" components. If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
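A sketch of the connection settings that replace the SplitJson + DistributeLoad pattern; these live on the Settings tab of the connection itself:

Connection > Settings
  Load Balance Strategy: Round robin
  Load Balance Compression: Do not compress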
10-21-2022
12:56 PM
@rangareddyy What is important to understand is that the NiFi component processors are not executed by the user authenticated into NiFi (assuming a secured NiFi), but rather by the NiFi service user. So let's say that your NiFi service is owned by a "nifiservice" linux account. Whatever umask is configured for that user will be applied to directories and files created by that user. Now, if your script is using sudo, it is changing the user that executes your script, resulting in different user ownership and permissions from the "nifiservice" user. Subsequent component processors will also execute as the "nifiservice" user and then not have access to those files and directories. So you'll need to take this into account as you build your scripts. Make sure that your scripts adjust permissions on the directory tree and files as needed, so your "nifiservice" user (or all users) can access the files needed downstream in your dataflows. In your case, it sounds like the script executed by your ExecuteScript processor is creating an sh file that is not owned by the "nifiservice" user or does not have execute permission set on it. The ExecuteStreamCommand processor will attempt to execute that sh command on disk as the "nifiservice" user only. If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
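A sketch of the kind of cleanup the script could do before it finishes, assuming a hypothetical output directory and the "nifiservice" account from above:

#!/bin/bash
# ... work that creates directories, files, and the downstream sh script ...

# Hand everything back to the NiFi service user and make the script executable
chown -R nifiservice:nifiservice /data/output
chmod -R u+rwX /data/output
chmod u+x /data/output/run_job.sh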
10-21-2022
12:41 PM
1 Kudo
@Jagapriyan Since this is a daily job, I may suggest you tackle this differently. You know your source files are written between 8 AM and 9 AM each day. So I would configure your ListSFTP to run on a cron schedule so it runs every second from 9 AM to 10 AM to make sure all files are listed. Then, knowing that your files may number 90+ (maximum unknown), I would configure the "Minimum Number of Entries" to some value you know the count will never reach. Make sure "Maximum Number of Entries" is set to a value higher than that. Then configure the "Max Bin Age" to some amount of time, say 30 minutes. What this does is allow MergeContent to continue allocating FlowFiles to a bin for 30 minutes, at which time the bin is forced to merge even if the min value has not been reached. Doing this makes sure you get only one FlowFile out per bin per node. That single FlowFile can then be used to trigger the PutEmail used for notification. Additionally, the merged FlowFile will have a "merge.count" attribute added that you can use in your email body to report the number of FlowFiles that were ingested (see the sketch after this response). If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
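A sketch of that configuration, with illustrative values:

MergeContent
  Minimum Number of Entries: 10000    (a value the daily file count will never reach)
  Maximum Number of Entries: 20000
  Max Bin Age: 30 mins

PutEmail
  Message: Ingested ${merge.count} files today.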
10-21-2022
12:28 PM
@Fredi A screenshot of the configuration of your UpdateAttribute processor, including the main configuration and the configuration in the "Advanced" UI, would be very helpful in understanding your setup and issue. Thanks, Matt
10-21-2022
12:23 PM
1 Kudo
@DGaboleiro I am not the assignee on jira https://issues.apache.org/jira/browse/NIFI-8043, but that Matt is an awesome guy @mburgess. Thanks, Matt
10-21-2022
12:21 PM
@RRosa I am not clear on what you mean by "migrating the flow files". A NiFi FlowFile is the object that traverses the connections between NiFi component processors on the NiFi canvas. Are you talking about migrating your actively queued FlowFiles from NiFi cluster 1 (Apache NiFi 1.12.1) to NiFi cluster 2 (Apache NiFi 1.17.0)? Or are you talking about migrating the flow.xml.gz file (which contains everything you have configured on the canvas of your NiFi) from the old cluster to the new one?

General guidance for upgrading Apache NiFi can be found in the admin guide here: https://nifi.apache.org/docs/nifi-docs/html/administration-guide.html#upgrading_nifi

The only thing I see NOT covered in that guidance is the preservation of component state. Within a cluster, component state may be stored, depending on the component, either in a local state directory on each node (each node holds only the state for that node) or in cluster state (written to ZK and shared across all nodes). If you are installing the new version of NiFi on the same hosts where the old NiFi nodes were running, simply preserve the state configuration, and the new nodes, when started with a copy of the flow.xml.gz, will continue to read and use the same state. The same goes for new nodes using the same external ZK that the previous nodes used (stop the old hosts before starting the new ones).

While the documentation recommends that you process out all queued FlowFiles from cluster 1 before starting cluster 2, that is not required. If the new nodes point at the same content, FlowFile, and provenance repositories as the previous nodes, that data will get loaded back in on startup and processing will continue where it left off. Remember that each node's repositories are unique to that node (meaning you can't combine them, and they don't all contain the same content).

Another thing to review is the release notes: https://cwiki.apache.org/confluence/display/NIFI/Release+Notes You'll want to review all the release notes between 1.12 and 1.17. Apache NiFi is known to deprecate and remove some components (processors, controller services, reporting tasks, etc.) from time to time, so check whether any components you use in your current dataflows have been removed. Additionally, some components may have changed, typically resulting in additional properties being added. When you start the newer version of NiFi, it will load your existing flow.xml.gz (1.17 will actually generate a flow.json.gz file from your flow.xml.gz) and upgrade all your components to use the newer 1.17 versions of the component classes. So you'll want to review your flow after the upgrade to make sure none of your components that were previously valid have become invalid because a new property exists that must be configured. NOTE: 1.17 will start using the flow.json.gz once upgraded, as the flow.xml.gz format is deprecated.

If you found this response assisted you with your query, please take a moment to login and click on "Accept as Solution" below this response. Thank you, Matt
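A minimal sketch of carrying flow and local state over when the new version is installed on the same host, assuming hypothetical install paths and default directory names:

# Stop the old node first
/opt/nifi-1.12.1/bin/nifi.sh stop

# Carry the flow and local component state into the new install
cp /opt/nifi-1.12.1/conf/flow.xml.gz /opt/nifi-1.17.0/conf/
cp -r /opt/nifi-1.12.1/state/local /opt/nifi-1.17.0/state/

# Point the new nifi.properties at the same content/flowfile/provenance
# repository paths as the old install, then start the new node
/opt/nifi-1.17.0/bin/nifi.sh start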
10-19-2022
08:25 AM
@orekxl @biblio_gr The following community article will help you understand what really happens when a user chooses to click "terminate" on a stopping NiFi processor with active threads: https://community.cloudera.com/t5/Community-Articles/Understanding-NiFi-s-quot-Terminate-quot-option-on-running/ta-p/355433 If you found this assisted you with your query, please take a moment to login and click "ACCEPT as Solution" below this response. Thank you, Matt
10-19-2022
08:20 AM
2 Kudos
The intent of this article is to cover exactly what happens when a user clicks the "terminate" button on a processor component that has an actively running task. Before we can discuss the "terminate" option, we need to understand a few basics about the NiFi application and a bit of history:
1. NiFi is a Java application, and the execution of any component (processors, controller services, reporting tasks, funnels, input/output ports, etc.) happens within that single Java Virtual Machine (JVM) process. NiFi does not create a child process for the execution of each component.
2. Since NiFi operates within a single JVM, it is not possible to "kill" a thread for an individual component without killing the entire JVM.
3. NiFi consists of well over 400 unique components, and many of them are not executing native NiFi code. Many use client libraries not managed or controlled by NiFi. Others can be configured to execute commands external to NiFi (ExecuteStreamCommand, ExecuteProcess, ExecuteScript, etc.). Processors that invoke something external to NiFi's code base will result in a child process being created with its own pid. Keep in mind that processors of this type do not limit what is invoked externally, so they take a generic approach to handling those child processes: the JVM invokes the external command and waits for it to report completion.
4. Historically, NiFi did not offer a terminate option, since killing a thread in the NiFi JVM is not possible. So when a component misbehaved (usually due to an issue external to NiFi code, such as network problems, a hung client library, or a hung external command), that NiFi component processor would get stuck, with the JVM thread waiting on the client library or external process to return. As such, the processor's concurrent task JVM thread is blocked. While you could select to stop the processor, that would not help users get past the hung or long-running thread. NiFi processors transition to a "stopping" state, where they remain until the library or task they are waiting on completes. Until that happens, users cannot modify the configuration or restart the component. This meant that for truly hung threads, the component was blocked until the NiFi JVM was restarted.
5. As a result of the inconvenience/impact a hung thread causes, NiFi introduced the "terminate" option on a "stopped" component with an active thread.
What actually happens when a user clicks "terminate":
1. "Terminate" is only possible after a processor has been asked to stop and that stopped processor still has an associated JVM thread running.
2. Since we know that killing a JVM thread is not possible without killing the entire JVM process (NiFi), the "terminate" option takes a different approach. When a processor executes, it typically does so in response to an inbound queued FlowFile as the trigger. That means the inbound FlowFile is tied to the JVM thread that is executing. When the thread completes, that FlowFile (or a modified, cloned, or new FlowFile, depending on processor function) is moved to the appropriate outbound relationship of the processor.
3. So what the "terminate" function really does is release the FlowFile associated with the running JVM thread back to the inbound connection, make a request to the client library or external command to abort/exit, and then isolate the thread so that if it does actually complete post-terminate, all returns are simply sent to null.
4. When "terminate" has been selected, the UI will render the processor's active threads differently to indicate that the processor has JVM threads that have been terminated but are still active. NOTE: The number within the parentheses denotes the current number of terminated threads still active.
5. If the client or external command responds to the request to exit, the active "terminated" thread will disappear. If not, it will continue to exist until the thread finally completes or the entire NiFi JVM is restarted. NOTE: A terminated thread has little impact on resources, since a hung thread isn't consuming CPU. A long-running, CPU-intensive thread, however, may have impact.
6. Now that the "terminated" JVM thread has been isolated, and any FlowFile(s) tied to that thread have been released back to the originating connection, users can modify the processor configuration and start the component processor again. When started, the processor will execute again on the FlowFile(s) that once belonged to the terminated thread, so no data loss is incurred as a result of using "terminate".
The "terminate" capability allows users to move on without needing to restart their NiFi JVM, thus reducing downtime and impact to other dataflows running on the NiFi canvas. If you have a processor that constantly has hung thread issues or very long running threads, it is time to start looking at your source FlowFile(s), processor configuration, external command, or the external service the processor may be waiting on as possible sources of the issue. Reference: Apache NiFi Terminate documentation
10-03-2022
05:54 AM
1 Kudo
@leandrolinof I see no reason why using UpdateAttribute to establish the needed path and filename values for the FetchSFTP processor would not work. FetchSFTP has no dependency on using ListSFTP. ListSFTP just serves as a mechanism for obtaining a list of files from a target SFTP server and recording state. ListSFTP simply creates a FlowFile, with the needed attributes set, for each file found on the target. So if you have another method built that can set those attributes, you are good to go. If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
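A sketch of such an UpdateAttribute, assuming FetchSFTP is left at its default Remote File value of ${path}/${filename} and using hypothetical directory and file names:

UpdateAttribute
  path: /data/incoming
  filename: report-${now():format('yyyyMMdd')}.csv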
09-30-2022
02:44 PM
@Jagapriyan Your described flow above does not mention the MergeContent processor, which is what would be needed to merge multiple FlowFiles with matching attribute values into one output FlowFile. Share your MergeContent processor configuration. Additionally, the ListSFTP processor does not download the content of the files from the remote server. It is only used to list the files on the remote server and set attributes on the FlowFile that would be used by the FetchSFTP processor to actually download the content. How do you know when you have all the files for a given state? Is this a continuous feed of files? Is this a daily job? While the file count differs per state, is the count consistent for each state? What is the highest count and the lowest count? Thanks, Matt
09-30-2022
02:35 PM
@Kushisabishii What are you seeing in the nifi-user.log when you make this import attempt? You may be getting the 403 because the user is not authorized properly to perform the import call. If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
09-30-2022
02:31 PM
@KD9 How long the NiFi server will consider a client's token valid is configured within the login-identity-providers.xml file via the following property: Authentication Expiration. When setting up an automated process, using client tokens is not the best method. A better option would be to authenticate your client via a client certificate. With a client certificate, there is no need to obtain a token. The client certificate will continue to work for the life of the certificate (certificates have a valid-until date set when you generate them). So instead of passing a bearer token in your curl command, you would use your client pem key. The owner DN from the client certificate would be used as the user identity, which you would then need to authorize in NiFi for the rest-api endpoint(s) needed for your automation. If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
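A sketch of such a curl call, with hypothetical certificate file names and an illustrative endpoint:

curl --cert client.pem --key client-key.pem --cacert nifi-ca.pem \
  https://nifi.example.com:8443/nifi-api/flow/status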
09-30-2022
02:19 PM
@John-MaxQ The MergeContent processor utilizes the Apache Commons Compress library, which has a hard limit on tar size. There is an existing Apache NiFi jira for this here: https://issues.apache.org/jira/browse/NIFI-10273 If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
09-30-2022
02:13 PM
@leandrolinof ListSFTP and FetchSFTP are commonly used to fetch files from a remote server. In order to use ListFile or FetchFile instead, the remote directory would need to be mounted as a local file directory. Matt
09-30-2022
02:09 PM
@PepeClaro Can you share your processor configuration and NiFi version being used? Thanks, Matt
09-30-2022
02:04 PM
@ImranAhmed If I am understanding you correctly, you are trying to verify the content has been correctly modified before it is written out via your InvokeHTTP processor, i.e., inspecting that your dataflow is working as expected. If you stop the InvokeHTTP processor that comes after the ReplaceText processor, the FlowFiles processed by ReplaceText will queue in the connection. You can then right-click on the connection and list the queue. From there you can view/download the content of a FlowFile post-ReplaceText to verify it contains the expected modified content before the InvokeHTTP is started and writes it to your DB. Thanks, Matt
09-26-2022
03:36 PM
@ImranAhmed I am not clear on what you are trying to do here. You are looking for some exact string and replacing it with nothing? The search value is a Java regex, and from what little I can see, it is looking for an exact string match which is being compared against the "entire text". "Entire text" means the entire content of your FlowFile is being loaded into NiFi's JVM heap as well. Why "entire text" instead of "line-by-line"?
09-26-2022
03:29 PM
1 Kudo
@knighttime You should not be configuring any of your NiFi processors to use the Event Driven scheduling strategy. It was never moved from an experimental method to production ready, and advances in the Timer Driven scheduling strategy have made it more efficient. So Event Driven is pretty much deprecated at this point in time. If you are not using Event Driven scheduling on any processor component in your NiFi, you should not be setting a large "Maximum Event Driven Thread Count" pool (default is 5, but I recommend setting it to 1). While you can increase the maximum while NiFi is running, reducing it will require you to restart your NiFi. Now, when it comes to the "Maximum Timer Driven Thread Count" pool: we can create a large pool, which is per node in your 4-node cluster (80 threads x 4). Then you configure concurrent tasks on your individual processors to scale concurrency on each processor component. Also keep in mind that many processors execute just to check inbound connection queues, and those threads may only be active for microseconds before being released back to the thread pool, so actually seeing full thread utilization represented in the status bar of your NiFi may be difficult. Tips about the concurrent task setting on processors: setting a high concurrent task configuration across many processors can be worse for overall performance than leaving everything set at 1. Start in the basement (1 concurrent task) and slowly increment concurrent tasks on processors as needed. You mention data queueing up, but it is difficult to tell you why, or provide guidance, without seeing your dataflow and knowing which processors have data backed up in their inbound connections and the configuration of those processors. If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
09-22-2022
11:04 AM
@wasabipeas I can think of no way to force a delete if it is blocking on a revision mismatch between nodes. Nothing here has anything to do with version control. Is it always the same node reported in the pop-up message that fails to process the request? If so, have you verified the libs and version running on that one node match the rest of the cluster? If you go to the cluster UI and select the "VERSIONS" tab, do they all reflect the same version?

You could manually disconnect the one node it keeps complaining about from the "NODES" tab. After it is disconnected, you could delete it from the cluster (deleting the node does nothing to the flow or data on that node; it will require a restart of that one node to get it to rejoin the cluster). Once the node is removed from your cluster (temporarily), your cluster should reflect 10/10 connected nodes in the status bar of the canvas UI. Check to see if you are still having revision issues with the process group after reloading the page.

If all looks good, you could access the filesystem of the currently disconnected and deleted node, stop the NiFi service on that node, and delete/rename the flow.xml.gz and flow.json.gz files. Then start this node again. On startup, NiFi will inherit the flow from the cluster and, in doing so, get the cluster flow's current revision for the problematic process group (a sketch follows this response).

If the problem persists, restart the node that was deleted so that it rejoins the cluster. Then disconnect the currently elected cluster coordinator. A new cluster coordinator will then be elected by zookeeper. Check to see if the issue with the process group is resolved. Reload your browser to force a page refresh. If the issue is resolved, rejoin the node to the cluster via the cluster UI to see if the issue returns. If so, we at least know which node is our problematic node. You can, of course, disconnect, delete, rename flow.xml.gz and flow.json.gz, and then restart the node, just as we did before, so that the flow is pulled from the cluster on startup.

If the issue still persists, there is something unique about this node. Is disk space OK? Any exceptions in the logs? While the node may report the same NiFi version, is something different in the contents of the lib(s) folders (get a checksum and compare against the other nodes)?

Hope this helps without needing to restart the entire cluster, Matt
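A sketch of that flow reset on the disconnected node, assuming a default conf directory:

# With the node disconnected and deleted from the cluster:
./bin/nifi.sh stop
mv conf/flow.xml.gz conf/flow.xml.gz.old
mv conf/flow.json.gz conf/flow.json.gz.old    # if present
./bin/nifi.sh start    # the node inherits the current flow from the cluster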
09-22-2022
06:53 AM
@prparadise The NiFi MergeRecord processor assigns queued FlowFiles on inbound connections to bins. Bins can only contain "like FlowFiles". In order for two FlowFiles to be considered 'like FlowFiles', they must have the same schema (as identified by the Record Reader) and, if the <Correlation Attribute Name> property is set, the same value for the specified attribute. Initial thoughts:
1. Perhaps your source FlowFiles are resulting in unique inferred schemas. The XMLRecordSetWriter can be configured with a schema write strategy such as "Set 'avro.schema' attribute", so that each output merged FlowFile has the schema added to an attribute (this would allow you to compare the inferred schemas on multiple FlowFiles to see if they match).
2. The minimum number of records per bin is still set to 1. When a Merge-type processor executes, it looks at an inbound connection and allocates queued FlowFiles to bin(s). At the end of binning, it checks whether any bin is eligible for merge. This processor can execute very fast and frequently. Let's say that each time it executes, the inbound connection contains only 1 FlowFile. Since min records per bin is 1, a bin with only one FlowFile would get merged. Try setting the min records to a higher value. Whenever you change the "min" settings, you should also set the "Max Bin Age" property. This forces a bin to merge after the configured amount of time even if the min values are not met (see the sketch after this response).
If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
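A sketch of those settings, with illustrative values:

MergeRecord
  Minimum Number of Records: 1000
  Max Bin Age: 5 mins

XMLRecordSetWriter
  Schema Write Strategy: Set 'avro.schema' Attribute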
09-22-2022
06:33 AM
@Chhavi Your question is not very clear. You ask, "is there any way I can integrate clustered nifi for storing and fetching the value". Storing and fetching what value? NiFi includes a couple of Redis controller services: RedisConnectionPoolService and RedisDistributedMapCacheClientService. For example, the RedisDistributedMapCacheClientService could be used by the PutDistributedMapCache NiFi processor to write values to Redis, and the FetchDistributedMapCache NiFi processor could be used to fetch values from Redis. If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
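A sketch of that cache pattern, assuming a hypothetical Redis host and a FlowFile attribute used as the cache key:

RedisConnectionPoolService
  Connection String: redis.example.com:6379

PutDistributedMapCache
  Distributed Cache Service: RedisDistributedMapCacheClientService
  Cache Entry Identifier: ${cache.key}

FetchDistributedMapCache
  Distributed Cache Service: RedisDistributedMapCacheClientService
  Cache Entry Identifier: ${cache.key}
  Put Cache Value In Attribute: cached.value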
09-22-2022
06:24 AM
@ImranAhmed Can you share a screenshot of your dataflow and the configuration of your ReplaceText processor? You mention XML. Is the source file you are trying to perform the replace on an XML-format file, or some binary format? If the content is not plain text (ASCII), your configured search value is probably not matching anything, and thus nothing gets changed in the original binary content. NiFi is a content agnostic application. This means that NiFi can ingest any type of data. It does this by wrapping that content into what NiFi calls a FlowFile. A FlowFile consists of two parts:
1. FlowFile metadata/attributes - stored in the NiFi flowfile_repository. It contains details about the content such as filename, size of content, location of the stored content, and any other attributes added by NiFi components (processors, controller services, etc.) as the FlowFile traverses these components in your dataflow.
2. FlowFile content - stored in claim files within the NiFi content_repository. NiFi simply writes the binary content to a claim and records the starting byte location and number of bytes of the content. This way NiFi does not need to be able to read the content to move it through a dataflow. It becomes an individual component's responsibility to know how to read the content of a FlowFile.
So NiFi includes processor components for many different data types. As far as XML content files, NiFi has limited native options (SplitXML, TransformXML, ValidateXML, XMLReader, and XMLRecordSetWriter). The latter two are controller services that could be used by processors like ConvertRecord. There is also the possibility that one of NiFi's scripting processors could be used, where the user writes a script that can read and handle the specific content type. There are also execute processors that can run an external command, on the server where NiFi is running, against the content of a FlowFile. So if there is an external command/service that can take the content as input and return modified content back, that could be used as well. If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
09-22-2022
05:32 AM
@wasabipeas What version of NiFi-Registry is being used as well? In your NiFi UI, search for component UUID (a8db3982-1350-1b8b-ffff-fffff988699d). What kind/type of component is it? What is the current state of the component (enabled, disabled, running, stopped, enabling, disabling, starting, stopping)? Share a screenshot of its current configuration. Thanks, Matt
09-22-2022
05:22 AM
@myzard Did your LDAP manager password contain any XML special characters? Did you verify that ldapsearch works from the same host where NiFi is installed, using that Manager DN and Manager Password, to get a return for the user you are trying to log in with? What output did you get from ldapsearch? For the ldap-provider, there are only two username/password pairs in use:
1. The Manager DN and Manager Password configured in the ldap-provider.
2. The username and password entered at the login interface.
Other suggestions:
- Make sure there are no leading or trailing whitespaces on the username or password configured in the provider or entered at the login window.
- Make sure the nifi.properties file is properly configured for the ldap-provider and not a different login provider like the kerberos-provider.
- Share your login-identity-providers.xml file.
Thanks, Matt
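A sketch of that ldapsearch verification, with hypothetical manager DN, base DN, and username:

ldapsearch -x -H ldap://ldap.example.com:389 \
  -D "cn=manager,dc=example,dc=com" -W \
  -b "ou=users,dc=example,dc=com" "(uid=jdoe)"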
09-22-2022
05:13 AM
@skoleg Looks like you may have an issue with your self-signed node certificates. Can you share the output of your keystore and truststore from both nodes:

keytool -v -list -keystore <keystore filename>
keytool -v -list -keystore <truststore filename>

I wonder if perhaps you are missing the required clientAuth ExtendedKeyUsage (EKU). Thanks, Matt
09-20-2022
11:16 AM
@skoleg Something is not configured the same if you are getting different behavior out of each node. Unfortunately, without seeing your configuration files (nifi.properties, login-identity-providers.xml, authorizers.xml, authorizations.xml, and users.xml) and app logs/user logs, it would be difficult to provide additional suggestions on your setup. Make sure your NiFi nodes are authorized to proxy user requests, though I'd expect you to get an exception in the UI if they were not. "Anonymous" happens when no client/user authentication was successful. Thanks, Matt