Member since: 07-30-2019
Posts: 3387
Kudos Received: 1617
Solutions: 999

My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 118 | 11-05-2025 11:01 AM |
| | 371 | 10-20-2025 06:29 AM |
| | 511 | 10-10-2025 08:03 AM |
| | 353 | 10-08-2025 10:52 AM |
| | 393 | 10-08-2025 10:36 AM |
02-18-2025
12:46 PM
@fy-test Welcome to the community. No matter which NiFi node you are connected to, any change request must be sent to the elected "Cluster Coordinator", which replicates that request to all connected nodes. If any of the nodes asked to make the change fails to respond in time, that node will get disconnected. The elected "Primary node" is the node on which any "primary node only" scheduled processor components will run. It is also important to understand that which node is elected as the "Primary" or "Coordinator" can change at any time, so I don't think forcing all your users on to the primary node is going to solve your issue. Even after a node disconnection caused by a failure of the request replication, the disconnected node should attempt to reconnect to the cluster and inherit the cluster flow if it is different from the local flow on the reconnecting node.

You should also be looking at things like CPU load average, heap usage, and garbage collection stats on your primary node versus the other nodes. Perhaps adjusting the max timer driven thread pool size or the cluster timeouts would be helpful (see the properties sketched below). Cluster Node Properties

How well are your dataflow designs distributing the load across all nodes in your cluster?

Please help our community grow and thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
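If timeout adjustments are needed, these are the cluster properties in nifi.properties I would start with (the values shown here are only illustrative, not recommendations for your environment):

nifi.cluster.node.connection.timeout=30 sec
nifi.cluster.node.read.timeout=30 sec
nifi.cluster.node.protocol.threads=10
nifi.cluster.node.protocol.max.threads=50

Raising the connection/read timeouts gives a busy node more time to respond to replicated requests before it is disconnected.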
02-18-2025
11:31 AM
1 Kudo
@Jaydeep Welcome to the community. Sharing the exact error and stack trace you encountered would be very helpful to get a better response in the community. Did you follow all the steps to set up this Bundle Persistence Provider?

S3BundlePersistenceProvider
https://central.sonatype.com/artifact/org.apache.nifi.registry/nifi-registry-aws-assembly/1.28.1/versions
https://repo1.maven.org/maven2/org/apache/nifi/registry/nifi-registry-aws-assembly/1.28.1/

A rough example configuration is sketched below. Please help our community grow and thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
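For reference, a rough sketch of what the extension bundle provider entry in providers.xml would look like once the nifi-registry-aws-assembly NAR is in place (the region, bucket, and credentials values below are placeholders you would replace with your own):

<extensionBundlePersistenceProvider>
    <class>org.apache.nifi.registry.provider.extension.S3BundlePersistenceProvider</class>
    <property name="Region">us-east-1</property>
    <property name="Bucket Name">my-bundle-bucket</property>
    <property name="Key Prefix"></property>
    <property name="Credentials Provider">DEFAULT_CHAIN</property>
</extensionBundlePersistenceProvider>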
02-18-2025
11:13 AM
1 Kudo
@mridul_tripathi That is not exactly the dataflow I was trying to convey, but good attempt. This is what I was envisioning (a summary of the full flow is sketched at the end of this post):

It starts with fetching the files from "SFTP1" using the ListSFTP and FetchSFTP processors. The ListSFTP processor will create a bunch of FlowFile attributes on the output FlowFile that can be used by FetchSFTP to fetch the content and add it to the FlowFile. In the FetchSFTP processor you will specify the SFTP1 hostname, username, and password. You will use NiFi Expression Language to tell FetchSFTP to fetch the specific content based on the FlowFile attributes created by ListSFTP.

Next the FlowFile (now with its content from SFTP1) is passed to the CryptographicHashContent processor, which will create a new FlowFile attribute (content_SHA-256) on the FlowFile with the content hash. Unfortunately, we have no control over the FlowFile attribute name created by this processor.

Next the FlowFile is passed to an UpdateAttribute processor, which is used to copy the content_SHA-256 value to a new FlowFile attribute and remove the content_SHA-256 attribute completely so it can be calculated again later after fetching the same file from SFTP2. I created a new FlowFile attribute (SFTP1_hash) where I copied over the hash. Clicking the "+" will allow you to add a dynamic property.

Next I pass the FlowFile to a ModifyBytes processor to remove the content from the FlowFile.

Now it is time to fetch the content for this same filename from SFTP2 by using another FetchSFTP processor. This FetchSFTP processor will be configured with the hostname, username, and password for SFTP2. We still want to use the filename from the FlowFile to make sure we are fetching the same file contents from SFTP2, so you can still use "${path}/${filename}" assuming both SFTP1 and SFTP2 use the same path. If not, you will need to set the path manually (<some SFTP2 path>/${filename}).

Now you pass the FlowFile to another CryptographicHashContent processor, which will hash the content fetched from SFTP2 for the same filename. At this point in time your FlowFile has a bunch of FlowFile attributes (including the hash of the content from SFTP1 (SFTP1_hash) and from SFTP2 (content_SHA-256)) and only the content from SFTP2.

Now it is time to compare those two hash attribute values to make sure they are identical using a RouteOnAttribute processor. Here you will create a NiFi Expression Language (NEL) expression to make this comparison. Clicking the "+" will allow you to add a dynamic property. Each dynamic property added to this processor becomes a new relationship on the processor. ${content_SHA-256:equals(${SFTP1_hash})} This NEL will return the value/string from the FlowFile's "content_SHA-256" attribute and check to see if it is equal to the value/string from the FlowFile's "SFTP1_hash" attribute. If true, the FlowFile will be routed to the new "Content-Match" relationship. If false, it will be routed to the existing "unmatched" relationship.

Here you can decide if you just want to auto-terminate the "Content-Match" relationship or do some further processing. The "unmatched" relationship will contain any FlowFiles where the content for two files of the same filename did not match; those FlowFiles will contain the content from SFTP2.

Hope this helps. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped.
Thank you, Matt
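For reference, the end-to-end flow described above looks like this:

ListSFTP (SFTP1)
  -> FetchSFTP (SFTP1)           fetches content using ${path}/${filename}
  -> CryptographicHashContent    adds content_SHA-256
  -> UpdateAttribute             copies content_SHA-256 to SFTP1_hash, removes content_SHA-256
  -> ModifyBytes                 drops the SFTP1 content
  -> FetchSFTP (SFTP2)           fetches the same filename from SFTP2
  -> CryptographicHashContent    re-adds content_SHA-256 (now for the SFTP2 content)
  -> RouteOnAttribute            ${content_SHA-256:equals(${SFTP1_hash})} -> "Content-Match" or "unmatched"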
02-18-2025
05:59 AM
1 Kudo
@MarinaM Welcome to the Cloudera Community. Your questions:

Does NiFi allow segregation of data sources? - I would need a bit more information from you on this requirement/question. You can certainly create non-connected dataflows on the NiFi canvas for handling NiFi FlowFiles from various sources to keep them separate. However, on the NiFi backend there is no segregation of content. NiFi stores the content of 1 to many FlowFiles in content claims. Anytime new content is created, it is written to the currently still open content claim no matter where in any of the dataflows that content is created. Content written to a content claim is immutable (it can't be modified once written), so anywhere in your dataflow where you make a modification to a FlowFile's content, the new modified version of the content is written to a new content claim.

Handling LEEF logs: logs reach NiFi with the original source IP but leave NiFi with NiFi's source IP. - I would need details on how you have these logs arriving at NiFi and being ingested.

Since QRadar first looks for the hostname in the payload (and if absent, uses the source IP), this could cause misidentification. - I am not familiar with QRadar, but perhaps you can modify the content when the hostname is missing in the payload via your NiFi dataflow(s)?

Can NiFi be configured to retain the original source IP while forwarding logs, without modifying the original log (to comply with legal requirements)? - NiFi is a data agnostic tool and does not differentiate logs from any other content it is processing. Content in NiFi is just bytes of data, and it becomes the requirement of any individual processor that may need to interact with the content of the FlowFiles to understand the content format. I would need to understand how you are ingesting these logs into NiFi. Some processors may be creating FlowFile attributes containing the source IP information, which perhaps you can use later in your dataflow. Perhaps another option is to build your dataflow to do lookups on the source IP and modify the syslog header when the hostname is missing?

Log Integrity & Authenticity: Does NiFi ensure log integrity and authenticity for legal and compliance purposes? - As mentioned above, NiFi is data agnostic and content claims are immutable. Once a log is ingested, the dataflow(s) you build can modify content if designed to do so, and that modified log content is written to a new content claim. Some processors that modify content may create an entirely new FlowFile with that content referenced in it, but others may just modify the existing FlowFile to point at the new modified content in the new content claim while keeping the original FlowFile identifier. Typically the first case applies to processors that have an "Original" relationship, where the unmodified original FlowFile routes to that relationship while the modified content is assigned to an entirely new FlowFile which becomes a child FlowFile of that original.

LEEF Parsing: Is there a NiFi processor available to parse LEEF logs before storing them in HDFS? - Based on this IBM doc on LEEF (https://www.ibm.com/docs/en/SS42VS_DSM/pdf/b_Leef_format_guide.pdf), LEEF logs consist of an RFC 5424 or RFC 3164 formatted syslog header, which can be parsed by the NiFi syslog processors: ListenSyslog, ParseSyslog, ParseSyslog5424, PutSyslog. Perhaps using PutSyslog instead of PutTCP can solve the source IP issue you encounter with PutTCP. (An illustrative LEEF record is sketched at the end of this post.)
There are also controller services that support these syslog formats: SyslogReader Syslog5424Reader Please help our community grow and thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
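For illustration only, a LEEF event is just an RFC 5424 (or RFC 3164) syslog header followed by a pipe-delimited LEEF payload, so the hostname QRadar looks for lives in that header. Every value below is made up, and the LEEF attribute pairs are normally tab-delimited (shown here with spaces for readability):

<134>1 2025-02-18T05:59:00.000Z host01.example.com firewall 1234 - - LEEF:1.0|ExampleVendor|ExampleProduct|1.0|LoginEvent|src=10.0.0.5 dst=10.0.0.9 usrName=jdoe

The syslog parsing processors should be able to pull a hostname out of a header like this into FlowFile attributes, which you could then use to rebuild or correct the header before forwarding.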
02-18-2025
05:20 AM
1 Kudo
@pavanshettyg5 You will get much better traction in the community by starting a new community question rather than adding on to an existing community question that already has an accepted solution. This particular community question is from 2017. Please start a new community question with the details of what you are observing. Thank you
02-14-2025
08:43 AM
@hus Thank you for the clarification on your use case. The only purpose-built processor NiFi has for appending lines to an existing file is the PutSyslog processor. But it is designed to send RFC 5424 and RFC 3164 formatted syslog messages to a syslog server and can't be used to append directly to a local file. However, your use case could be solved using the ExecuteStreamCommand processor and a custom script. The ExecuteStreamCommand processor passes a FlowFile's content to the input of the script.

Example: I created the following script, which I placed on my NiFi node somewhere my NiFi service user has access, and gave my NiFi service user execute permissions on the bash script (I named it file-append.sh):

#!/bin/bash
# Read the FlowFile content that ExecuteStreamCommand passes on stdin
STD_IN=$(</dev/stdin)
# $1 = target directory, $2 = target filename (command arguments from ExecuteStreamCommand)
touch "$1/$2"
# Append the FlowFile content to the target file
echo "$STD_IN" >> "$1/$2"

This script will take stdin from the ExecuteStreamCommand processor, which will contain the content of the FlowFile being processed. $1 and $2 are command arguments I define in the ExecuteStreamCommand processor, which I use to dynamically define the path and filename the content will be appended to. It then takes the FlowFile's content and either starts a new file or appends to an existing file with the passed filename. You can see that I set my two command arguments by pulling values from the "path" and "filename" NiFi FlowFile attributes set on the FlowFile being processed (an example configuration is sketched below). With this dataflow design you can append lines to various files as needed.

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
02-13-2025
01:22 PM
@OfekRo1 I looked at the StandardProvenanceEventRecord source on GitHub, plus I know many of the open source contributors. 😀 https://github.com/rdblue/incubator-nifi/blob/master/commons/data-provenance-utils/src/main/java/org/apache/nifi/provenance/StandardProvenanceEventRecord.java You're welcome, and thank you for being part of the community!
02-13-2025
09:56 AM
@mks27 What you are trying to accomplish is not possible in NiFi. In my 15 years of working with NiFi, I believe this is the first time I have seen such a request. So what you are expecting to happen is: NiFi presents the login window and a user supplies a username and password. You then expect NiFi to attempt authentication via one ldap provider and, if that results in an unknown username or bad password response, move on to the next ldap provider and attempt again? The users that will need access to your NiFi don't all exist in just one of your ldaps?

I suppose if you have a multi-node NiFi cluster setup, you could configure the ldap-provider on one node to use one of the ldap servers and the ldap-provider on another node to use the other ldap server. Since the NiFi cluster can be accessed from any node, you would just need to make sure your users access the NiFi cluster from the appropriate node that is configured with their ldap server. NOTE: Authorization (which happens after successful authentication) needs to be identical on all nodes in a cluster, but that is not an issue here. You'll just configure the authorizers.xml so that all user and group identities from both ldaps are authorized appropriately. This bootleg way of facilitating authentication via multiple LDAPs is not something I have ever tested/tried, but I believe it would work (a rough sketch of the per-node difference is at the end of this post).

You could also raise an improvement jira in the Apache Jira NiFi project to see if the community might be interested in implementing this change, but I don't anticipate there is much demand for it. https://issues.apache.org/jira/browse/NIFI

Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
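If you did try that per-node approach, the intent is that the only meaningful difference between the two nodes' login-identity-providers.xml files would be the ldap connection details, something along these lines (hostnames are placeholders, and the Manager DN / search bases would also differ per directory):

Node 1: <property name="Url">ldaps://ldap1.example.com:636</property>
Node 2: <property name="Url">ldaps://ldap2.example.com:636</property>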
02-13-2025
07:52 AM
@hus So what I am understanding is you do not want to overwrite the existing file, but rather update an existing file. Is my understanding correct? Can you share more detail or an example of what you are trying to accomplish? Are you looking for a way to search the content of an existing file for a specific string and then replace that string with a new string? NiFi processors are designed to perform work against FlowFiles contained within NiFi, but there are processors that can be triggered by a FlowFile to run a script against files outside of NiFi. You could also ingest the file, modify its content, and then write out the newly modified file. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
02-13-2025
06:26 AM
@mks27 What you are trying to do above is not possible in Apache NiFi. Apache NiFi only supports defining one login identity provider via the nifi.security.user.login.identity.provider property. It does not support a comma separated list of multiple login providers, so what is happening is NiFi is expecting to find a login provider in the "login-identity-providers.xml" file with: <identifier>ldap-provider-1, ldap-provider-2</identifier> which does not exist, and thus the error you are seeing. A correct single-provider example is sketched below. Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
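To be explicit, the property must name exactly one provider identifier, and that identifier must exist in login-identity-providers.xml, roughly like this (the provider's ldap properties are omitted here):

# nifi.properties
nifi.security.user.login.identity.provider=ldap-provider

<!-- login-identity-providers.xml -->
<provider>
    <identifier>ldap-provider</identifier>
    <class>org.apache.nifi.ldap.LdapProvider</class>
    ...
</provider>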