Member since
07-30-2019
2915
Posts
1446
Kudos Received
847
Solutions
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 56 | 04-26-2024 06:40 AM |
 | 348 | 04-23-2024 05:56 AM |
 | 55 | 04-22-2024 06:13 AM |
 | 189 | 04-17-2024 11:30 AM |
 | 137 | 04-16-2024 05:36 AM |
09-12-2022
06:36 AM
@AnkurKush It is impossible to provide a very specific solution without understanding the exact structure of your ldap user and group entries. You should obtain the output from the ldapsearch command for a sample user and a sample group you will be authorizing in NiFi. That output will help you correctly configure the empty properties needed.

Some general configuration guidance:
- You should avoid syncing ALL users and groups from your ldap. An ldap can contain thousands of users and groups, and when you sync all of these to NiFi, all of those user and group identities are loaded into NiFi's heap memory. Limiting what is synced to the specific users and groups that will be accessing your NiFi helps reduce heap usage. This can be controlled using the correct "User Search Filter" and "Group Search Filter" settings.
- I recommend always setting the "Page Size" property to 500. An ldap server is often configured to limit the maximum number of returns in a single request to 500 or 1000. If the result set is larger than that, returns will be missing unless you configure this property. It has no impact if there are fewer returns than the set page size of 500.

Specific guidance:
- When it comes to actually syncing user and group identity strings, the following section must be configured:

<property name="User Search Base">ou=people,dc=example,dc=net</property>
<property name="User Object Class">person</property>
<property name="User Search Scope">ONE_LEVEL</property>
<property name="User Search Filter"></property>
<property name="User Identity Attribute"></property>
<property name="User Group Name Attribute"></property>
<property name="User Group Name Attribute - Referenced Group Attribute"></property>
<property name="Group Search Base">ou=groups,dc=example,dc=net</property>
<property name="Group Object Class">groups</property>
<property name="Group Search Scope">ONE_LEVEL</property>
<property name="Group Search Filter"></property>
<property name="Group Name Attribute"></property>
<property name="Group Member Attribute"></property>
<property name="Group Member Attribute - Referenced User Attribute"></property>

1. "User Identity Attribute" and "Group Name Attribute" tell NiFi which property/attribute from the ldap return to use as the identity string for the returned user or group. Without these set, you'll get no response.
2. "User Group Name Attribute" in the user sync section tells NiFi's ldap-user-group-provider which attribute of a returned user entry contains the groups that user belongs to. Sometimes there is no group association in the user entries, and this is left blank. Without this set, NiFi will not be able to determine the groups associated to users via the user sync, and that association must be done via the group sync.
3. "Group Member Attribute" in the group sync section tells NiFi's ldap-user-group-provider which attribute of a returned group entry contains the users that belong to that group. Without this set, NiFi will be unable to determine which users are associated to the returned groups.
4. The two "Referenced Group/User Attribute" properties are needed when the user or group strings returned from the attribute configured in 2 or 3 above are not full Distinguished Names for the user or group. In that case, these properties define the attribute that contains the actual exact matching string.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped.

Thank you,
Matt
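A quick way to decide what belongs in the empty properties above is to look at a real entry. The LDIF below is an invented sample (the `ldapsearch` command in the comment shows how you would fetch a real one); the greps show which attributes would map to "User Identity Attribute" and "User Group Name Attribute" in this hypothetical layout.

```shell
# Invented sample entry for illustration only. In practice, capture a real one:
#   ldapsearch -x -H ldap://ldap.example.net \
#     -b "ou=people,dc=example,dc=net" "(uid=jdoe)"
cat > /tmp/sample-user.ldif <<'EOF'
dn: uid=jdoe,ou=people,dc=example,dc=net
objectClass: person
uid: jdoe
cn: John Doe
memberOf: cn=nifi-admins,ou=groups,dc=example,dc=net
EOF

# "User Identity Attribute" would be the attribute holding the login name:
grep '^uid:' /tmp/sample-user.ldif

# "User Group Name Attribute" would be the attribute listing group membership:
grep '^memberOf:' /tmp/sample-user.ldif
```

If `memberOf` returns full DNs (as above) while your group sync identifies groups by `cn`, that is exactly the case where the "Referenced Group Attribute" property comes into play.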
09-12-2022
05:59 AM
2 Kudos
@winbob
- What does the output of "ps -ef | grep -i nifi" return on the server where NiFi-Registry is installed?
- Is this an Apache NiFi-Registry install or a Cloudera managed NiFi-Registry install? (I see you mentioned you're not sure how it was installed, but properly uninstalling the service differs depending on how it was installed.)

Assuming this is just an Apache NiFi-Registry installation on Linux, the most common installation method is simply to unpack the NiFi-Registry tar.gz file in a user-defined install location (output from the "ps -ef" command will help determine where it is installed). Then the user would configure the nifi-registry.properties file in the conf directory and start the service via the "<install/unpack directory>/bin/nifi-registry.sh start" command.

1. Step one is to stop NiFi-Registry:
- Check whether NiFi-Registry was installed as a Linux service: "systemctl status nifi-registry". If this returns a result, NiFi-Registry was installed as a service; if it returns unknown service, it was not. Assuming it returns a result, you can stop the service using "systemctl stop nifi-registry". Then issue the following commands to remove the Linux service setup (this does not uninstall the product, so you will still need to complete steps 2 onward):
systemctl disable nifi-registry
rm /etc/systemd/system/nifi-registry
rm /usr/lib/systemd/system/nifi-registry
- If it is not a managed Linux service, stop NiFi-Registry using "<install/unpack directory>/bin/nifi-registry.sh stop" ("ps -ef | grep -i nifi" should not return any processes for NiFi-Registry after executing this command).

2. Now you must inspect some NiFi-Registry conf directory files to identify any directory locations defined by the user who installed the product. (By default they are all located within the directory tree where NiFi-Registry was unpacked, but very often these paths are manually changed.)
Search the following configuration files for directory paths:
- nifi-registry.properties
- authorizers.xml
- providers.xml
- logback.xml
NOTE: if any of the directory property paths start with "./", then NiFi-Registry is building that directory path within the existing NiFi-Registry installation path tree.

3. Once you have located all the paths external to the NiFi-Registry installation base path, you can start deleting those directories along with their sub-directories and contents.

4. Delete the NiFi-Registry installation directory tree. The product is now uninstalled.

If you found that the provided solution(s) assisted you with your query, please take a moment to login and click Accept as Solution below each response that helped.

Thank you,
Matt
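The conf-file inspection in step 2 can be scripted. This is a hypothetical sketch: the sample nifi-registry.properties below is invented, and the grep simply flags any directory property whose value is an absolute path (i.e. does not start with "./"), since those live outside the install tree and must be deleted separately in step 3.

```shell
# Invented sample conf file for illustration; point the grep at your real
# nifi-registry.properties, authorizers.xml, providers.xml, and logback.xml.
mkdir -p /tmp/nifi-registry-demo/conf
cat > /tmp/nifi-registry-demo/conf/nifi-registry.properties <<'EOF'
nifi.registry.db.directory=./database
nifi.registry.extension.dir.default=./lib
nifi.registry.provider.flow.dir=/opt/nifi-registry-flows
EOF

# Properties whose value starts with "/" point outside the install tree:
grep '=/' /tmp/nifi-registry-demo/conf/nifi-registry.properties
```

Here only the /opt path would need separate cleanup; the "./database" and "./lib" trees disappear with the install directory in step 4.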
08-22-2022
06:18 AM
@yongki "Remote" state should only be configured in the TailFile processor when the directory containing the file being tailed is mounted on every node in the NiFi cluster (meaning the flow running on each NiFi cluster node has access to the exact same file being tailed). If it is a shared directory/file, then the TailFile processor must also be configured to execute on "Primary node" only.

Matt
08-15-2022
07:50 AM
@VJ_0082 Since your log is being generated on a remote server, you will need to use a processor that can remotely connect to the external server to retrieve that log.

Possible designs:
1. You could incorporate a FetchSFTP processor into your existing flow. I assume your existing RouteOnAttribute processor is checking for when an error happens with your script? If so, add the FetchSFTP processor between that processor and your PutEmail processor. Configure the FetchSFTP processor (with a "Completion Strategy" of DELETE) to fetch the specific log file created. This dataflow assumes the log's filename is always the same.
2. A second flow could be built using ListSFTP (configured with a filename filter) --> FetchSFTP --> any processors you want to use to manipulate the log --> PutEmail. The ListSFTP processor would be configured to execute on the "primary" node and be configured with a "File Filter Regex". When your 5 minute flow runs and encounters an exception resulting in the creation of the log file, the ListSFTP processor will see that file and list it (0 byte FlowFile). That FlowFile will have all the FlowFile attributes needed for the FetchSFTP processor (configured with a "Completion Strategy" of DELETE) to fetch the log, which is added to the content of the existing FlowFile. If you do not need to extract from or modify that content, your next processor could just be the PutEmail processor.

If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post.

Thank you,
Matt
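To illustrate the "File Filter Regex" in design 2: the file names below are invented, and NiFi evaluates the property as a Java regex, but grep -E is close enough for a quick sanity check of the pattern before pasting it into ListSFTP.

```shell
# Hypothetical file names in the remote directory; only the error log
# should match the filter so ListSFTP never lists the other files.
printf '%s\n' script_error_20220815.log script_output.csv other.txt \
  | grep -E '^script_error_.*\.log$'
```

Anything matched here would be listed (as a 0 byte FlowFile) and then fetched by FetchSFTP; anything not matched is ignored by the flow entirely.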
08-11-2022
11:49 AM
@VJ_0082 I apologize if I am still not clear about your script being executed by the ExecuteStreamCommand processor. When it fails/errors, where is the script writing its log file: locally on the NiFi node host, or on some external server where NiFi is not located? I assume your flow is passing the "original" FlowFile with a new FlowFile attribute "execution.error" containing your exception/error? Does this attribute contain the details you want to send in your email? Then you are routing based on which attribute from the original FlowFile (execution.status)? How is your PutEmail processor configured? It can be configured to send either a FlowFile attribute or FlowFile content. If the content you want to send via PutEmail is not in the content of the FlowFile and is also not in a FlowFile attribute on the FlowFile, but rather written to some location on disk, you would need a separate dataflow that watches that log file on disk for new entries, ingesting them into a FlowFile that feeds a PutEmail processor. This could be accomplished using the TailFile processor. If that content is being written out to a log file on a remote system, that becomes a different challenge.

If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post.

Thank you,
Matt
08-11-2022
11:23 AM
1 Kudo
@Ray82 Thank you for the additional details. It appears we have a disconnect in the NiFi terminology being used. A NiFi FlowFile is the object that is passed from one component to another via a connection on the NiFi UI canvas. A FlowFile consists of two parts:
1. FlowFile Content - This is the content of the FlowFile and is stored in the NiFi content repository within a content claim. A single content claim may contain the content for one to many FlowFiles.
2. FlowFile Attributes/Metadata - Written to the NiFi FlowFile repository. This is a collection of metadata about the FlowFile, such as which claim contains the content for the FlowFile, the offset within that claim where the content starts, and the length of the content. It also contains attributes associated to the specific FlowFile like filename, size, etc. Users can also add additional FlowFile attributes. These FlowFile attributes are not part of the content of the FlowFile.

- Your use case starts with an existing FlowFile with specific content.
- So I am guessing that you are extracting portions of content from your original FlowFile and assigning them to NiFi FlowFile attributes?
- Then using those FlowFile attributes in the SQL query executed by the ExecuteSQL processor? The ExecuteSQL processor is designed to write the SQL query response to the content of the FlowFile (new content; it does not append to existing content).
- What you really want to do is preserve the original FlowFile and add another FlowFile attribute to it that was retrieved using your ExecuteSQL, without overwriting the original FlowFile content?

Maybe you could use the DistributedMapCache processors to accomplish this: The processor in the upper right corner is an UpdateAttribute processor. It is used to create a cacheID that will persist on both FlowFiles output from this processor (original and clone, since the "success" relationship was drawn twice).
It simply takes the UUID from the original FlowFile and adds it to each FlowFile in the FlowFile attribute "cacheID". Then you would still have your ExecuteSQL and ExtractText flow to replace the content with just the "A3" model. Then I configure both my PutDistributedMapCache and FetchDistributedMapCache processors to use "${cacheID}" as the "Cache Entry Identifier". The FlowFile with the original content will move to the FetchDistributedMapCache processor, where it will loop in the "not.found" relationship connection until the other FlowFile using the same cacheID value writes your model "A3" to that unique cache entry via the PutDistributedMapCache processor (which writes the content of the FlowFile to the cache entry). In my example I am using the "DistributedMapCacheClientService" (simply because it is quick and easy), but there are better options that offer High Availability (HA). The DistributedMapCacheClientService requires that NiFi also has a DistributedMapCacheServer controller service for it to talk to and store your cache entries in.

If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post.

Thank you,
Matt
08-11-2022
11:09 AM
@PradNiFi1236 What do you see in the nifi-user.log on the NiFi receiving the FlowFiles from this reporting task? I would expect it to show the 401 and hopefully the identity string that was used to identify the client. Matt
08-10-2022
01:54 PM
1 Kudo
@kellerj You can use the NiFi-Registry Rest-Api to change the description on an existing version controlled flow. https://nifi.apache.org/docs/nifi-registry-docs/rest-api/index.html#updateFlow

curl -X 'PUT' 'https://<nifi-registry-hostname>:<nifi-registry-port>/nifi-registry-api/buckets/<bucket UUID>/flows/<versioned flow UUID>' \
  -H 'Authorization: Bearer <token>' \
  -H 'Content-Type: application/json' \
  --data-raw '{"bucketIdentifier":"<bucket UUID>","description":"new-description","name":"<versioned flow name>","type":"Flow"}' \
  --compressed --insecure

My NiFi-Registry is secured and uses a login provider for user authentication. If you are doing the same, you will need to obtain a bearer token for a user who has read/write/delete on the bucket containing the flow you want to modify. The rest of what you need to make this rest-api call can be obtained via the NiFi-Registry's UI for the flow to be modified. Using the above example:
<versioned flow name> = "test"
<bucket UUID> = "17ca6981-c4a1-418f-807f-ab0b72a997ff"
<versioned flow UUID> = "fa88a095-6867-45de-9dba-23e828224d3d"

If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post.

Thank you,
Matt
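One failure mode with --data-raw is a malformed JSON body, which the server rejects before your change is applied. A small hedged sketch, reusing the example UUIDs from this post: build the body in a file and validate it locally first, then pass it to curl with --data @/tmp/update-flow.json instead of the inline string.

```shell
# Build the updateFlow request body (values are the example UUIDs above).
cat > /tmp/update-flow.json <<'EOF'
{
  "bucketIdentifier": "17ca6981-c4a1-418f-807f-ab0b72a997ff",
  "description": "new-description",
  "name": "test",
  "type": "Flow"
}
EOF

# json.tool exits non-zero on malformed JSON, so this fails loudly:
python3 -m json.tool /tmp/update-flow.json > /dev/null && echo JSON OK
```

This keeps the quoting problems of a one-line --data-raw string out of the picture and makes the payload easy to diff when the call returns a 400.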
08-10-2022
11:36 AM
@SandyClouds The ExecuteSQL processors do not support SSH tunnels. The expectation of these processors is that the SQL server is listening on a port reachable on the network. SSH tunnels are used to access a server remotely and then execute a command locally on that server utilizing its SQL client. The ExecuteSQL processor uses a DBCPConnectionPool to facilitate the connection to the database. The DBCPConnectionPool establishes a pool of connections used by one to many processors sharing this controller service to execute their code. A Validation Query is very important to make sure a connection from this pool is still good before being passed to the requesting processor for use.

While I have not done this myself, I suppose you could set up an SSH tunnel on each NiFi cluster server (example: https://linuxize.com/post/mysql-ssh-tunnel/). Then you could still use the DBCPConnectionPool, except use the established tunnel address and port in the database connection URL. The downside to this is that NiFi has no control over that tunnel, so if the tunnel is closed, your dataflow will stop working until the tunnel is re-established. The Validation Query will verify the connection is still good. If it is not, the DBCPConnectionPool will drop it and try to establish a new connection.

If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post.

Thank you,
Matt
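For concreteness, a hypothetical sketch of the tunnel approach described above (host names, ports, and the database name are all invented; this follows the pattern in the linked tutorial and would need to be run on EACH NiFi node, since NiFi does not manage the tunnel itself):

```shell
# Forward local port 3307 to the remote MySQL server's 3306 through the
# SSH gateway. -N: no remote command, -f: background after authentication.
ssh -N -f -L 3307:localhost:3306 tunneluser@db-gateway.example.com

# The DBCPConnectionPool "Database Connection URL" then points at the
# local end of the tunnel rather than the remote server, e.g.:
#   jdbc:mysql://localhost:3307/mydb
```

If the ssh process dies, the Validation Query in the DBCPConnectionPool is what surfaces the failure, so do not leave it unset in this setup.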
08-10-2022
11:02 AM
@Ray82 I am assuming you are using the ExecuteSQL processor to execute the SQL SELECT statement example you shared. The response would be written to the content of the FlowFile passed to the success relationship. You could use the ExtractText processor to extract content from the FlowFile and assign it to a new FlowFile attribute you name "model".

If you found this response assisted with your query, please take a moment to login and click on "Accept as Solution" below this post.

Thank you,
Matt
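As a hedged sketch of the ExtractText step: ExecuteSQL writes its result set as Avro by default, so this assumes you have converted it to JSON (or configured a JSON output) first. The sample record is invented; ExtractText evaluates Java regex, but the grep/sed pipeline below exercises essentially the same pattern you would put in a dynamic property named "model".

```shell
# Invented one-record result, as it might look after Avro-to-JSON conversion.
cat > /tmp/result.json <<'EOF'
[{"model": "A3"}]
EOF

# ExtractText dynamic property "model" could use a Java regex like:
#   "model"\s*:\s*"([^"]+)"
# The capture group becomes the value of the new "model" attribute.
grep -oE '"model"[[:space:]]*:[[:space:]]*"[^"]+"' /tmp/result.json \
  | sed -E 's/.*"([^"]+)"$/\1/'
```

The capture group is what matters: ExtractText stores group 1 in the attribute, so the quotes and key name stay out of the value.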