Member since
07-30-2019
3472
Posts
1642
Kudos Received
1020
Solutions
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 261 | 06-03-2026 06:06 PM |
| | 529 | 05-06-2026 09:16 AM |
| | 1044 | 05-04-2026 05:20 AM |
| | 586 | 05-01-2026 10:15 AM |
| | 703 | 03-23-2026 05:44 AM |
07-26-2024
06:17 AM
1 Kudo
@cadrian90 I agree with @SAMSAL's response. Typically the ConvertRecord processor is what would be used here. The processor supports numerous record readers and record writers. The GrokReader is what would commonly be used to parse unstructured data like your Cisco syslog messages. While the GrokReader has a built-in pattern file, you may find yourself needing to define a custom pattern file for your specific data. You might find this other community post helpful here: https://community.cloudera.com/t5/Support-Questions/ExtractGrok-processor-Writing-Regex-to-parse-Cisco-syslog/td-p/233095
Beyond the above, this is where it becomes challenging, since Apache NiFi only has a CEFReader and no CEFRecordSetWriter (perhaps you can raise an Apache Jira asking for this new writer, and someone in the Apache community may be able to help). There does exist a ScriptedRecordSetWriter; if you know how to script out the CEF format, maybe you can use that. I really would not be able to help there myself. Maybe you can look into the CSVRecordSetWriter to see if selecting a custom format would facilitate an output like CEF. Again, not something I have tried myself.
Hope this helps you with your use case journey.
Please help our community thrive. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
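For anyone attempting the scripted route, the CEF line layout itself is simple to produce once the syslog fields have been parsed. The following is only a rough standalone Python sketch of that formatting logic, not a NiFi ScriptedRecordSetWriter implementation. The record keys (`event_id`, `name`, `severity`, `extension`) and the default vendor/product/version values are hypothetical placeholders, not something taken from the original question.

```python
def escape_header(value):
    """Escape backslashes and pipes, which are special in the CEF header."""
    return str(value).replace("\\", "\\\\").replace("|", "\\|")


def escape_extension(value):
    """Escape backslashes, equals signs and newlines in CEF extension values."""
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace("=", "\\=")
        .replace("\n", "\\n")
    )


def to_cef(record, vendor="Cisco", product="ASA", version="1.0"):
    """Format one parsed record as a CEF line.

    The record keys and the vendor/product/version defaults are
    hypothetical; map them to whatever your Grok pattern extracts.
    """
    header_fields = [
        vendor,
        product,
        version,
        record["event_id"],
        record["name"],
        record["severity"],
    ]
    header = "|".join(["CEF:0"] + [escape_header(f) for f in header_fields])
    extension = " ".join(
        f"{key}={escape_extension(val)}"
        for key, val in record.get("extension", {}).items()
    )
    return f"{header}|{extension}"


if __name__ == "__main__":
    sample = {
        "event_id": "106023",
        "name": "Deny tcp",
        "severity": 4,
        "extension": {"src": "10.0.0.1", "dst": "10.0.0.2"},
    }
    print(to_cef(sample))
```

The same header-then-extension logic would need to be ported to whichever script engine your NiFi version offers for the ScriptedRecordSetWriter.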
07-23-2024
06:25 AM
@mirkom As for the post you referred to in your original question: it is not accurate. The GetSFTP processor does NOT accept an inbound connection. The only SFTP ingest processor that accepts an inbound connection is the FetchSFTP processor (which is the processor that other query was actually referring to). I also can't speak to the customized version of the ListSFTP processor built in that other thread. Thanks, Matt
07-23-2024
06:18 AM
@mirkom NiFi is a flow-based programming application. Processor configuration properties can get their values from parameter contexts, which might be useful for you here. Some processor properties can also get values by using NiFi Expression Language. NiFi is designed as an "always on" application, with its dataflows run using the available scheduling strategies. Source processors (those with no inbound connections) need to have a valid configuration in order to start, meaning the properties needed, at a minimum, to execute must be available to the processor. So for a source processor, the only way to have those values is if they are set directly on the processor or pulled from a parameter context.
In NiFi you can create a Process Group (PG) and then build a reusable dataflow within it (from your description it sounds like you have only a few different flow designs needed to meet your use cases). For your reusable dataflow, you should use a ListSFTP connected to a FetchSFTP to ingest data. On a process group you can configure/define a "parameter context". A parameter context holds the unique configuration values for each of your source and destination hosts, so you would have 500 different parameter contexts. You can then copy your PG many times and simply assign a different parameter context to each copy, making dataflow development a bit easier. Building out in this way makes expansion easier, but it still requires some work.
Also keep in mind that you have many use cases where you are simply moving content from SFTP server A to SFTP server B. When you utilize NiFi for this use case, you are ingesting all that content into your NiFi cluster and then writing it back out to another SFTP server. This adds some overhead in read and write operations versus a direct transfer between A and B without the local write that NiFi would do. What NiFi lets you do is manage all these dataflows through the NiFi UI. NiFi also allows you to scale out easily by adding more nodes to the cluster as workload and volume increase, without needing to modify your dataflows (assuming they are built well, with data distribution implemented in the designs).
Thank you, Matt
07-23-2024
05:44 AM
1 Kudo
@NagendraKumar You might want to try using the QueryRecord processor or the ScriptedTransformRecord processor. Since your data is unstructured, you could try using the GrokReader and the FreeFormTextRecordSetWriter. I agree that splitting and merging is not ideal with so many FlowFiles. ExtractText loads FlowFile content into memory in order to parse it for extracting bits (high heap usage). MergeContent loads FlowFile metadata (FlowFile attributes and other metadata) into heap memory for all FlowFiles allocated to merge bins (high heap usage, which can be managed via multiple MergeContent processors in series, each limiting the max bin FlowFile count). Hope this helps give you some alternate direction. Thank you, Matt
07-22-2024
10:23 AM
@NagendraKumar ExtractText is only going to work with a well-defined content structure. So when you have an unknown number of records in a single FlowFile, you would be better off splitting that multi-record file into single-record files against which you can apply your ExtractText and ReplaceText dataflow. You can then easily merge those split records back into one file using a MergeContent with the Defragment option.
Since your files have an unknown number of records separated by a blank line, the SplitContent processor can easily be used to split the source FlowFile into individual record FlowFiles. The "Byte Sequence" is simply two line returns. After your ExtractText and ReplaceText processors, you can recombine all the splits into one FlowFile using MergeContent with the Defragment option.
Thank you, Matt
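To make the split-then-defragment idea concrete outside of NiFi, here is a small Python model of it. This is only a simplified sketch, not how the processors are implemented. It assumes plain `\n` line endings, uses the `fragment.identifier`, `fragment.index` and `fragment.count` attribute names that the split and merge processors rely on, and the dictionary layout and the upper-casing "transformation" are made up for illustration.

```python
import uuid


def split_content(content, byte_sequence="\n\n"):
    """Split content on a blank line and tag each piece with fragment attributes."""
    parts = [part for part in content.split(byte_sequence) if part]
    fragment_id = str(uuid.uuid4())
    return [
        {
            "fragment.identifier": fragment_id,
            "fragment.index": index,
            "fragment.count": len(parts),
            "content": part,
        }
        for index, part in enumerate(parts)
    ]


def defragment(fragments, demarcator="\n\n"):
    """Reassemble fragments of one original file in their original order."""
    identifiers = {frag["fragment.identifier"] for frag in fragments}
    if len(identifiers) != 1:
        raise ValueError("fragments belong to more than one source file")
    expected = fragments[0]["fragment.count"]
    if len(fragments) != expected:
        raise ValueError(f"expected {expected} fragments, got {len(fragments)}")
    ordered = sorted(fragments, key=lambda frag: frag["fragment.index"])
    return demarcator.join(frag["content"] for frag in ordered)


if __name__ == "__main__":
    source = "a=1\nb=2\n\nc=3\nd=4\n\ne=5"
    splits = split_content(source)
    # Stand-in for the per-record ExtractText/ReplaceText work.
    for frag in splits:
        frag["content"] = frag["content"].upper()
    # Arrival order does not matter; the index restores the original order.
    print(defragment(list(reversed(splits))))
```

The point of the fragment attributes is that the merge step does not depend on arrival order and will only complete once every piece of the original file is present.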
07-19-2024
12:57 PM
@NagendraKumar Oftentimes there is more than one way to solve a use case. Here is one possible solution.
NiFi components used:
SplitRecord: used to split your multi-row CSV record into individual records. This processor will use a CSVReader and a CSVRecordSetWriter [configuration screenshots not preserved].
The "Splits" relationship then gets routed to a ReplaceText processor (used to reformat the individual line record). "Search Value", based on four items per line (header and body): ^(.*?),(.*?),(.*?),(.*?)[\r\n]+(.*?),(.*?),(.*?),(.*?)[\r\n]+ "Replacement Value": [screenshot not preserved]
The "Success" relationship is then routed to a MergeContent processor (used to recombine the original multi-records into a single FlowFile) [configuration screenshot not preserved]. Note: the Demarcator is configured with a line return to provide a new line between records in the content.
The assembled portion of this dataflow looks like this: [screenshot not preserved]
Above is a working solution based on your shared example. It works no matter how many CSV rows exist in the source file.
Other possibilities: I feel like this use case could also be accomplished using the ScriptedTransformRecord processor. I am just not sure myself how to write the script needed here correctly. Perhaps others in the community have suggestions.
Thank you, Matt
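To see what the "Search Value" above captures, the same expression can be tried in Python. This is only an illustrative sketch: the sample record is made up, and since the original "Replacement Value" is not shown in the text, the code just pairs each header capture group (1 to 4) with its matching body capture group (5 to 8) rather than guessing at the real replacement.

```python
import re

# Same pattern as the ReplaceText "Search Value": four comma-separated
# header items, a line break, four body items, and a trailing line break.
SEARCH_VALUE = re.compile(
    r"^(.*?),(.*?),(.*?),(.*?)[\r\n]+(.*?),(.*?),(.*?),(.*?)[\r\n]+"
)


def capture_pairs(record_text):
    """Return (header, value) pairs for one split record, or None if no match."""
    match = SEARCH_VALUE.match(record_text)
    if match is None:
        return None
    groups = match.groups()
    # Groups 1-4 come from the header line, groups 5-8 from the body line.
    return list(zip(groups[:4], groups[4:]))


if __name__ == "__main__":
    # Hypothetical single-record output of SplitRecord (header plus one row).
    record = "id,name,city,zip\n1,Ann,Oslo,0150\n"
    print(capture_pairs(record))
```

Note that the pattern expects exactly four items per line and a trailing line break after the body row, so a record with a different column count will not match at all.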
07-18-2024
05:40 AM
@Ali_12012 The InvokeHTTP processor utilizes the OkHttp client library. This library does not support a body in a GET request: https://github.com/square/okhttp/issues/3154
I am not familiar myself with what other client libraries exist that support this, but I am guessing there must be some out there since Postman handles this for you. A scripted processor allows you to write custom code that can use whatever client libraries you want. You could also build your own custom processor that utilizes some other client you may be able to identify that supports GET with a body. Sorry I can't be of more help here.
As @SAMSAL shared, there are reasons why this is not supported or standard convention. Postman does not adhere to those standards and lets you do what you want.
Thank you, Matt
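As one example of a client that does not block this, Python's standard `http.client` module will send a body with a GET request. The sketch below is a standalone illustration rather than NiFi processor code; the local echo server, the `/search` path and the JSON payload are all hypothetical and exist only so the example can run on its own. Whether your target API actually reads a GET body is up to that server.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class EchoHandler(BaseHTTPRequestHandler):
    """Tiny local test server that echoes back whatever body a GET carries."""

    def do_GET(self):
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        # Keep the demo output quiet.
        pass


def get_with_body(host, port, path, payload):
    """Send a GET request carrying a JSON body; return (status, raw body)."""
    body = json.dumps(payload).encode("utf-8")
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request(
            "GET", path, body=body, headers={"Content-Type": "application/json"}
        )
        response = conn.getresponse()
        return response.status, response.read()
    finally:
        conn.close()


def demo():
    """Start the echo server on a free local port and send one GET with a body."""
    server = HTTPServer(("127.0.0.1", 0), EchoHandler)
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        return get_with_body(
            "127.0.0.1", server.server_port, "/search", {"query": "nifi"}
        )
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(demo())
```

Inside NiFi the equivalent logic would have to be written in whatever language your scripted processor or custom processor supports, using a client library that permits a GET body.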
07-17-2024
10:20 AM
1 Kudo
@PriyankaMondal In versions of Apache NiFi older than 1.16, NiFi does not allow any edits within the NiFi cluster while a node is disconnected. Changes are only allowed on the actual disconnected node.
Apache NiFi 1.16.0 introduced a new flow inheritance feature that allows a joining node whose existing flow.xml.gz/flow.json.gz does not match the cluster elected flow to join the cluster by inheriting the cluster elected flow. A joining node would only be blocked from this process if inheriting the cluster flow would result in data loss (meaning the joining node's flow contains a connection holding queued FlowFiles and the cluster elected flow does not have that connection).
Later it was determined that this change can make it difficult to handle the outcome of the above issue: https://issues.apache.org/jira/browse/NIFI-11333 So it was decided that the best course of action was to not allow any component deletion while a node is disconnected.
When a NiFi node is started, it attempts to join the cluster. If the node fails to join the cluster, it shuts back down to prevent users from mistakenly using it as a standalone node. That means users had no easy way to handle the queued data in the connection preventing the rejoin. Of course users could configure the node to come up standalone, but that does not make things any easier on the end user. The node loads up standalone, loads its FlowFiles and, depending on whether auto.resume was set or not, starts processing FlowFiles. This still leaves the user with FlowFiles queued in many connections throughout the UI, and they would have a very difficult time determining which connection(s) were removed and need to be processed out in order to rejoin the cluster. So the decision was made to stop allowing deletion when a node is disconnected.
That being said, when a NiFi cluster has a disconnected node, users can decide to navigate to the cluster UI and drop the disconnected node(s) from the cluster. The cluster will then have full functionality again, as it will report all existing nodes as connected. It will require a restart of the dropped node(s) to get them to attempt to connect to the cluster again. But keep in mind that when a dropped node attempts to join the cluster and inherit the cluster flow, you may run into the problem described above.
Thank you, Matt
07-16-2024
05:43 AM
@3ebs The "Insufficient Permissions Untrusted proxy CN=Node_name,OU=NIFI" message shown in the web UI when you try to log in is not an error. It is an authorization issue. It tells me that you have a multi-node NiFi cluster setup. You are accessing the UI of one of the NiFi cluster nodes, where you are successfully authenticating your user, resulting in a user identity of "AMOHAMED279". At this point your user is only successfully authenticated to that one node.
What that node does next is load the NiFi canvas. In order to display that canvas, information that the user is authorized to see (PGs, stats, etc.) must be collected from all nodes. That request is forwarded to the elected cluster coordinator node, which then replicates the request to all nodes to get those details. So the node itself acts as a proxy in this process, making these requests on the authenticated user's behalf. In order for this to be successful, the NiFi nodes in your cluster must be authorized to proxy user requests. This message is telling you that one or more of your node identities has not been authorized to proxy user requests.
To help more here, I would need to know what you have configured in the authorizers.xml for user identity authorization. The most common NiFi cluster setup utilizes the StandardManagedAuthorizer, which calls the file-access-policy-provider (which builds the authorizations.xml if it does not already exist), which in turn calls one of the user-group-providers (there are multiple options: Composite-Configurable-User-Group-Provider, Composite-User-Group-Provider, Ldap-User-Group-Provider, File-User-Group-Provider, etc.). The user-group-providers are responsible for generating user identities (case sensitive) for the purpose of setting up authorization policies. The file-user-group-provider is most commonly used to add the node user identities by creating the users.xml (if it does not already exist).
So somewhere in your authorizers.xml setup, your node user identities have not been added and/or authorized for various policies, including the very important "proxy user requests" policy. That would have been handled automatically on initial startup, with the first creation of the authorizations.xml and users.xml files, assuming a proper setup in the authorizers.xml.
Resources: Authorizer Configuration, FileUserGroupProvider, LdapUserGroupProvider, Composite Implementations, FileAccessPolicyProvider, StandardManagedAuthorizer, Configuring Users & Access Policies
Thank you, Matt
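One quick sanity check you can run against your own authorizers.xml is to confirm that every "Node Identity" listed under the file-access-policy-provider also appears, with exactly the same case and spacing, as an "Initial User Identity" under the file-user-group-provider. The Python sketch below does only that comparison. The embedded sample XML and the node DNs in it are made up for illustration, and the check does not apply if your node identities come from a different user-group-provider (LDAP, for example). Also keep in mind these initial properties are only used when users.xml and authorizations.xml are first created.

```python
import xml.etree.ElementTree as ET

# Hypothetical, trimmed-down authorizers.xml. Note the case mismatch on
# node2 between the two providers ("Node2" vs "node2").
SAMPLE_AUTHORIZERS = """
<authorizers>
    <userGroupProvider>
        <identifier>file-user-group-provider</identifier>
        <class>org.apache.nifi.authorization.FileUserGroupProvider</class>
        <property name="Users File">./conf/users.xml</property>
        <property name="Initial User Identity 1">CN=node1, OU=NIFI</property>
        <property name="Initial User Identity 2">CN=Node2, OU=NIFI</property>
    </userGroupProvider>
    <accessPolicyProvider>
        <identifier>file-access-policy-provider</identifier>
        <class>org.apache.nifi.authorization.FileAccessPolicyProvider</class>
        <property name="User Group Provider">file-user-group-provider</property>
        <property name="Authorizations File">./conf/authorizations.xml</property>
        <property name="Node Identity 1">CN=node1, OU=NIFI</property>
        <property name="Node Identity 2">CN=node2, OU=NIFI</property>
    </accessPolicyProvider>
</authorizers>
"""


def property_values(element, name_prefix):
    """Collect non-empty property values whose name starts with name_prefix."""
    values = set()
    for prop in element.findall("property"):
        text = (prop.text or "").strip()
        if prop.get("name", "").startswith(name_prefix) and text:
            values.add(text)
    return values


def missing_node_users(authorizers_xml):
    """Return node identities that have no exactly matching initial user identity."""
    root = ET.fromstring(authorizers_xml)
    known_users = set()
    for provider in root.findall("userGroupProvider"):
        known_users |= property_values(provider, "Initial User Identity")
    node_identities = set()
    for provider in root.findall("accessPolicyProvider"):
        node_identities |= property_values(provider, "Node Identity")
    # Identities are case sensitive, so this is a plain string comparison.
    return sorted(node_identities - known_users)


if __name__ == "__main__":
    print(missing_node_users(SAMPLE_AUTHORIZERS))
```

Any identity this reports is a node that the access policy provider cannot match to a known user, which is one way a node ends up without the "proxy user requests" policy.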
07-16-2024
05:18 AM
1 Kudo
@PradNiFi1236 Not much information is provided here to investigate with. What is the jar that is causing the issue? How is the jar execution being invoked? What is the full exception being encountered (is there a stack trace with the exception)? If you install JDK 1.8.0_312 and launch Apache NiFi 1.17 using that JDK version, does the issue persist? Thank you, Matt