Member since: 02-05-2021
Posts: 23
Kudos Received: 1
Solutions: 0
10-16-2023
02:19 PM
1 Kudo
@techNerd Clearing a processor component's state requires stopping the processor before you can "clear the state". The stopped state is required because the processor may be writing or updating state when you attempt to clear it, which would cause issues. When the processor is stopped, there is no need to worry about a race condition between writes and deletes.
That being said, resetting the sequence number stored in state to 0 can be accomplished using the advanced UI of the UpdateAttribute processor and a special reset-seq FlowFile you feed into the processor at 00:00 each day. The advanced UI of the UpdateAttribute processor works like if/then/else logic. So you would set up a Rule "reset" and a condition (the "if"). If the condition is true, the Rule's "Actions" are applied. If none of the Rules' conditions are true, the processor's non-advanced UI properties are applied.
UpdateAttribute properties (same as you already have):
Click on "advanced" in the lower left corner of the processor configuration UI to open and configure Rules:
Now all you need to do is set up a GenerateFlowFile processor that feeds a FlowFile into the UpdateAttribute processor once a day to reset the seq value stored in that UpdateAttribute processor's local state to 0. Optionally, you could add a RouteOnAttribute processor after the UpdateAttribute to route out the reset-seq FlowFile for termination so it does not continue through your dataflow.
If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
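To make the if/then/else behavior described in the post above easier to picture, here is a minimal Python sketch of the logic only. It is not NiFi code: the class name, the "reset-seq" attribute with value "true", and the "seq" state key are assumptions made for illustration, since the original rule configuration was shown in screenshots that are not reproduced here.

```python
class UpdateAttributeModel:
    """Toy stand-in for an UpdateAttribute processor that keeps a counter in state."""

    def __init__(self):
        # Plays the role of the processor's stored local state.
        self.state = {"seq": 0}

    def process(self, attributes):
        # Hypothetical "reset" rule: fires only for the special reset FlowFile.
        if attributes.get("reset-seq") == "true":
            self.state["seq"] = 0
        else:
            # No rule matched, so the regular (non-advanced) behavior applies:
            # increment the stored sequence number.
            self.state["seq"] += 1
        result = dict(attributes)
        result["seq"] = str(self.state["seq"])
        return result


processor = UpdateAttributeModel()
before_reset = [processor.process({"filename": f"file{i}"})["seq"] for i in range(3)]
reset_result = processor.process({"reset-seq": "true"})["seq"]
after_reset = processor.process({"filename": "next-day"})["seq"]
print(before_reset, reset_result, after_reset)
```

The point of the sketch is that the reset happens through the normal processing path, so the processor never has to be stopped the way a manual "clear state" would require.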
09-27-2023
06:24 AM
@techNerd I don't see a question in your post. I can only assume you are talking about the missing "key" policy icon on your NiFi Flow root process group? This indicates your authenticated user is not authorized to view or modify all policies. What is also interesting from your screenshot is that the user identity displayed in the upper right corner is a UUID and not "CN=sys_admin, OU=NIFI" from your user certificate. So I think you have multiple issues here with your configuration.
Inspect your nifi.properties, login-identity-providers.xml, and authorizations.xml files for configuration issues. Also take note that the file-user-group-provider ONLY creates the users.xml file if it does not already exist during startup. It does not modify an already existing file. The file-access-policy-provider generates the authorizations.xml (different file from authorizers.xml) ONLY if it does not already exist at startup. It will not modify an already existing file.
What version of Apache NiFi is being used? Did you maybe leave remnants of the single-user-provider or single-user-authorizer configured? If so, remove these two providers from your configuration.
Below is more info about the "initial admin":
The intent of the "Initial Admin" is to give that user just enough authority to function as a NiFi admin: access the UI; view and modify tenants (users and groups); create new user and group identities (assumes file-based authorization is configured); assign or remove access policies for users/groups; access the NiFi controller settings; and view and modify the root process group (only on first startup with no pre-existing flow.xml.gz/flow.json.gz in place). It is not meant to grant the admin all policies, but the admin has the ability to add themselves to all policies. There are often clear divisions of responsibility between admins and dataflow designers/engineers. An admin not involved with creating flows would have no need to be able to build flows, access component configurations, view content, view data provenance, etc. So policies of this nature are not assigned as part of the initial admin setup.
If you found any of the suggestions/solutions provided helped you with your issue, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
07-22-2023
01:40 AM
Thanks for the Awesome information!
06-09-2023
10:24 AM
@naveenb Your query will get better visibility if you start a new question in the community rather than asking on an already solved question.
NiFi's ListSFTP and GetSFTP (deprecated in favor of ListSFTP and FetchSFTP) processors only list/get files. When one of them generates a NiFi FlowFile from a file it finds recursively within the configured base directory on the source SFTP server, it adds a "path" attribute to that FlowFile. That "path" attribute has the absolute path to the file.
So based on your configuration, the results you are seeing are expected, since you configured your PutSFTP with "/home/ubuntu/samplenifi/${path}", where the "path" attribute on your FlowFiles resolves to "/home/nifiuser/nifitest/sample" for files found in that source subdirectory.
You can use NiFi Expression Language (NEL) to modify that "path" attribute string to get rid of the "/home/nifiuser" portion:
/home/ubuntu/samplenifi/${path:substringAfter('/home/nifiuser')}
If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped. Thank you, Matt
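As a rough illustration of what the expression in the post above evaluates to, the following Python sketch mimics substringAfter outside of NiFi. The helper substring_after is a hypothetical stand-in written for this example; it assumes the documented behavior of returning the text after the first occurrence of the argument, or the whole subject when the argument is absent.

```python
def substring_after(subject, value):
    """Return the part of subject that follows the first occurrence of value."""
    index = subject.find(value)
    if index < 0:
        # Argument not present: the subject is returned unchanged.
        return subject
    return subject[index + len(value):]


path = "/home/nifiuser/nifitest/sample"  # "path" attribute value from the post

# Same argument as in the post: the remainder keeps its leading slash.
remote_path = "/home/ubuntu/samplenifi/" + substring_after(path, "/home/nifiuser")

# Variant with a trailing slash in the argument, which avoids the doubled slash.
remote_path_trimmed = "/home/ubuntu/samplenifi/" + substring_after(path, "/home/nifiuser/")

print(remote_path)
print(remote_path_trimmed)
```

With the argument exactly as posted, the result contains "//" after "samplenifi". Whether that matters depends on the target SFTP server, so the trailing-slash variant may be the safer choice.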
01-24-2022
04:47 AM
1 Kudo
Are you able to run the same Command Arguments as the nifi user (or whichever user is responsible for running the NiFi service) against the path determined by fileslocation, outside of NiFi at a command prompt?
01-19-2022
06:01 AM
@OliverGong I would avoid, whenever possible, dataflow designs that extract the entire content of a FlowFile into FlowFile attribute(s). While FlowFile content only exists on disk (unless read into memory by a processor during processing), FlowFile attributes are held in NiFi's JVM heap memory all the time (the exception is per-connection swapping, which happens when a specific connection reaches the swap threshold set in the nifi.properties file). FlowFiles with lots of attributes and/or large attribute values will consume considerable amounts of JVM heap, which can lead to JVM Out Of Memory (OOM) exceptions, long stop-the-world JVM Garbage Collection (GC) events, etc. When options exist that avoid adding large attributes, those should be utilized. Thanks, Matt
01-10-2022
01:31 PM
@techNerd I think your scenario may need a bit more detail to understand what you are doing and what the flow is doing versus what you want it to do.
The ListFile processor only lists information about file(s) found in the target directory. It then generates one or more FlowFiles from the listing that was performed. A corresponding FetchFile processor would actually retrieve the content for each of the listed files. From the sounds of your scenario, you have instituted a 20 sec delay somehow between that ListFile and FetchFile processor? Or you have configured the run schedule on the ListFile processor to "20 secs"?
Setting the run schedule only tells the processor how often it should request a thread from the NiFi controller that can be used to execute the processor code. Once the processor gets its thread, it will execute. The ListFile processor will list all files present in the target source directory based on the configured file and path filters. For each file listed it will produce a FlowFile. Run schedule does not mean it executes for a full 20 seconds continuously checking the input directory to see if new files arrive.
The run schedule is also not impacted by how long it takes a listing to complete. It will request a thread every 20 seconds (00:00:20, 00:00:40, 00:01:00, etc.). The configured "concurrent tasks" controls whether the processor can execute multiple listings in parallel. Let's say the thread that was executed at 00:01:00 was still executing 20 seconds later. Since that thread is still using the default 1 concurrent task, the ListFile would not be allowed to request another thread from the controller at that time. Since the run schedule is independent of the thread execution duration, there is no way to dynamically alter the schedule. There is also no way for a new file to get listed at the same time as a previous file (unless both were already present at time of listing) within the same thread execution.
The ListFile processor uses the configured "Listing Strategy" to control how it handles listing of files. A "tracking" strategy is used to prevent the ListFile processor from listing the same file twice by recording some information in a state provider or a cache. If "No Tracking" is configured, the ListFile will list all found files every time it executes. ListFile does not remove the source file from the directory. Removal of the source file is a function optionally handled by the corresponding FetchFile processor.
If this is not clear, share more details about your specific use case and flow design so I can provide more direct feedback. Here is the documentation around processor scheduling (works the same no matter which processor is being used): https://nifi.apache.org/docs/nifi-docs/html/user-guide.html#scheduling-tab
If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
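The difference between a tracking strategy and "No Tracking" described in the post above can be sketched with a small Python model. This is only a toy: the function list_files, the "last_modified" state key, and the use of a plain dict as the directory are assumptions for illustration, and NiFi's real timestamp tracking is more involved than a single stored timestamp.

```python
def list_files(directory_snapshot, state, tracking=True):
    """Return the filenames one listing run would emit.

    directory_snapshot maps filename -> modification time (seconds).
    state is a dict standing in for the processor's stored state.
    """
    last_seen = state.get("last_modified", -1)
    listed = []
    for name, mtime in sorted(directory_snapshot.items()):
        if tracking and mtime <= last_seen:
            continue  # already listed by an earlier run
        listed.append(name)
    if tracking and directory_snapshot:
        # Remember the newest timestamp seen so far for the next run.
        state["last_modified"] = max(last_seen, max(directory_snapshot.values()))
    return listed


state = {}
first_run = list_files({"a.txt": 100}, state)
second_run = list_files({"a.txt": 100, "b.txt": 120}, state)
third_run = list_files({"a.txt": 100, "b.txt": 120}, state)
untracked_run = list_files({"a.txt": 100, "b.txt": 120}, {}, tracking=False)
print(first_run, second_run, third_run, untracked_run)
```

Each call represents one scheduled execution. Note that the source files are never removed by the listing itself, which matches the division of work between ListFile and FetchFile.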
09-01-2021
12:55 PM
@techNerd After you fetch the two files, you could route the success relationship twice via two different connections. One of those connections continues those two files down your existing dataflow path. The secondary path could go to a ReplaceText processor that replaces the entire content of the FlowFile with the filename, which is stored in a FlowFile attribute named "filename". Editing the content down this secondary path does not affect the content of the FlowFile in the other dataflow path. Now that you have the filename in the content, you can use other processors to store that content with the filename wherever you want for future retrieval. Or you could merge those FlowFiles into a single FlowFile that would then contain the list of filenames before you store it. This is just a matter of your specific use case for later consumption of this stored data. Take a look at the PutDistributedMapCache processor as an example. If you found this response addressed your query, please take a moment to login and click on "Accept as Solution" below this post. Thank you, Matt
07-29-2021
06:13 AM
Maybe a TransformXml with this XSLT might be more future proof:
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>
  <xsl:template match="node()">
    <xsl:copy>
      <xsl:apply-templates />
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
06-08-2021
06:07 AM
@techNerd The PutSFTP processor contains the following configuration property: Do you have that set to false on the particular PutSFTP processor throwing the exception? Thanks, Matt