Member since: 07-30-2019
Posts: 3469
Kudos Received: 1641
Solutions: 1018
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 183 | 05-06-2026 09:16 AM |
| | 295 | 05-04-2026 05:20 AM |
| | 249 | 05-01-2026 10:15 AM |
| | 470 | 03-23-2026 05:44 AM |
| | 355 | 02-18-2026 09:59 AM |
03-03-2021
06:28 AM
1 Kudo
@Pavitran This does present a challenge. Typically the ListFile processor is used to list files from a local file system. It is designed to record state (by default based on last-modified timestamp) so that only newer files are listed, but the first run lists all files by default. Also, looking at your example, your latest directory does not correspond to the current day.

ListFile does not actually consume content; it generates a 0-byte FlowFile for each file listed, along with attributes/metadata about the source file. The FetchFile processor is then used to fetch the actual content. With large listings this lets you redistribute the 0-byte FlowFiles across all nodes in your cluster before consuming the content (provided the same local file system is mounted on all nodes; if each node sees different files, do not load balance between the processors).

So you could make a first run that lists everything and simply delete those 0-byte FlowFiles. That establishes state, and from that point on ListFile will only list the newest files.
Pros:
1. State allows this processor to be unaffected by outages; the processor will still consume all non-previously-listed files after an outage.
Cons:
1. The initial run creates potentially a lot of 0-byte FlowFiles to get rid of in order to establish state.
2. After an extended outage, on restart the flow may consume more than just the latest files, since it will consume all files with timestamps newer than the timestamp last stored in state.

Other options:
A: The ListFile processor has an optional "Maximum File Age" property, which limits the listing to files no older than the configured amount of time.
Pros of setting this property:
1. Reduces or eliminates the massive listing on first run.
Cons of setting this property:
1. Under an extended outage that exceeds the configured "Maximum File Age", a file you wanted listed may be skipped.
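ListFile's timestamp-based state behavior described above can be sketched in shell. This is a hypothetical illustration (the `list_newer` helper and state-file idea are not NiFi code), assuming a flat directory:

```shell
# Sketch of ListFile's state behavior: the first run emits every file;
# later runs emit only files modified after the previous run's timestamp,
# which is recorded here as the mtime of a state file.
list_newer() {
  local dir="$1" state="$2"
  if [ -f "$state" ]; then
    find "$dir" -maxdepth 1 -type f -newer "$state"
  else
    find "$dir" -maxdepth 1 -type f   # first run: everything is listed
  fi
  touch "$state"                      # record "now" as the new state
}
```

This also illustrates the outage trade-off above: if the state file is stale after an extended outage, every file newer than it gets listed on the next run.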
B: Since FetchFile uses attributes/metadata to fetch the actual content, you could craft the source FlowFile yourself and send it to the FetchFile processor. For example, use an ExecuteStreamCommand processor to execute a bash script on disk that lists the files only from the latest directory. Then use UpdateAttribute to add the other attributes FetchFile requires to get the actual content, and use SplitText to split that listing of files into individual FlowFiles before the FetchFile processor.
Pros:
1. You are in control of what is being listed.
Cons:
1. Depending on how often a new directory is created and how often you run your ExecuteStreamCommand processor, you may end up listing the same source files again, since ExecuteStreamCommand has no state option. You may be able to handle this with a DetectDuplicate processor in your flow design.
2. If the listed directory has a new file added to it after a previous listing by ExecuteStreamCommand, the next run will list all previous files again along with the new ones for that directory. Again, this might be handled with a DetectDuplicate processor.

Hope this helps give you some ideas,
Matt
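A minimal sketch of the bash listing for option B, assuming dated subdirectory names that sort lexicographically (the `latest_files` helper and the directory layout are hypothetical):

```shell
# Print the files in the lexicographically newest subdirectory of $1,
# one absolute path per line. ExecuteStreamCommand would emit this output
# as FlowFile content for splitting and attribute enrichment downstream.
latest_files() {
  local base="$1"
  local latest
  latest=$(find "$base" -mindepth 1 -maxdepth 1 -type d | sort | tail -n 1)
  find "$latest" -maxdepth 1 -type f | sort
}
```

The lexicographic sort only picks the newest directory if names are date-ordered (e.g. YYYY-MM-DD); otherwise sort on mtime instead.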
02-11-2021
07:57 AM
@adhishankarit When moving on to a new issue, I recommend always starting a new query for better visibility (for example, someone else in the community may have more experience with the new issue than me).

As far as your new query: your screenshots do not show any stats on the processor, so it is hard to get an idea of what we are talking about in terms of performance. How many fragments are being merged? How large is each of these fragments?

NiFi nodes are only aware of, and only have access to, the FlowFiles on that individual node. So if node "a" is "out" (not sure what that means), any FlowFiles still on node "a" that belong to the same fragment set will not yet have been transferred to node b or c to get binned for merge. A bin cannot be merged until all fragments are present on the same node. Since you mention that the bin eventually merges after 10 minutes, that tells me all fragments do eventually make it onto the same node. I suggest the first thing to address here is the space issue on your nodes.

Also keep in mind that while you have noticed node "a" has always been your elected primary node, there is no guarantee that will always be the case. A new Cluster Coordinator and Primary Node can be elected by ZooKeeper at any time. If you shut down or disconnect the currently elected primary node "a", you should see another node get elected as primary node. Adding node "a" back in will not force ZooKeeper to elect it as primary node again. So don't build your flow around a dependency on any specific node always being the primary node.

Matt
02-10-2021
10:27 AM
1 Kudo
@bsivalingam83 The ability to "ignore" properties in various NiFi config files was added with the CFM 1.0.1 release. With older CFM versions (1.0.0) you can set a safety valve to overwrite the currently set java.arg.13 value with something else. The safety valve simply defines a key=value pair that will not be used by the NiFi bootstrap. The end result: NiFi no longer uses G1GC and instead uses the default garbage collector for the version of Java being used.

Hope this helps,
Matt
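As a sketch, the safety valve entry in bootstrap.conf might look like the following. The replacement value is a hypothetical no-op system property, and the arg number must match whichever `java.arg.N` carries the G1GC flag in your release:

```
# Overwrites the G1GC argument with a harmless property the JVM ignores
java.arg.13=-Dignore.gc.arg=true
```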
02-10-2021
09:03 AM
3 Kudos
@Jarinek The process really depends on what update you are trying to make.
1. You cannot remove a connection that has queued FlowFiles in it, but you can redirect a connection with queued data to a different target processor.
2. You cannot redirect a connection if the processor it is currently attached to still has a running thread. Stopping a processor does not kill threads; it simply tells the processor not to execute again at the configured run schedule. Existing threads will continue to run until they complete. Until all threads exit, the processor is still in a state of "stopping" even though the UI reflects a red square for "stopped".
3. You cannot modify a processor if it still has running threads (see the note about "stopping" processors above).
4. If you stop the component on the receiving side of a connection, any FlowFiles queued on that connection that are not tied to an active thread still running on the target processor will not be processed and will remain queued on the connection. You can manually empty a queue through a REST API call (which means data loss), but that is not necessary if you are not deleting the connection.

Attempts to perform configuration changes while components still have active threads or are in a running state will result in an exception being thrown and the change not happening. Attempts to remove connections that have queued FlowFiles will throw an exception and block the removal.

Now, if all you are trying to do is modify some configuration on a processor, all you need to do is stop the processor, check that it has no active threads, make the config change, and then start the processor again.

I am not sure what you are asking with "update the flow ignoring any data in failure or error connection queues". NiFi does not ignore queued FlowFiles. It is also not wise to leave connections with queued FlowFiles just sitting around your dataflows. Those old queued FlowFiles will prevent removal of the content claims that contain their data. Since a content claim can contain the data from one to many FlowFiles, this can result in your content repository filling up. NiFi can only remove content claims that no FlowFiles point to anymore.

Here are some useful links:
https://nipyapi.readthedocs.io/en/latest/nipyapi-docs/nipyapi.html
https://github.com/Chaffelson/nipyapi
http://nifi.apache.org/docs/nifi-docs/rest-api/index.html
https://community.cloudera.com/t5/Community-Articles/Update-NiFi-Connection-Destination-via-REST-API/ta-p/244211
https://community.cloudera.com/t5/Community-Articles/Change-NiFi-Flow-Using-Rest-API-Part-1/ta-p/244631

Hope this helps,
Matt
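The "empty a queue via the REST API" step can be sketched as follows. The host and connection UUID are placeholders, and on a secured cluster the request would also need an authorization token:

```shell
# Build the drop-request URL used to empty a connection's queue.
# POSTing to it creates a drop request; note this means data loss.
drop_request_url() {
  local host="$1" connection_id="$2"
  printf 'https://%s/nifi-api/flowfile-queues/%s/drop-requests' \
    "$host" "$connection_id"
}

# Example (placeholder values, do not run as-is):
# curl -X POST "$(drop_request_url nifi-node1:8443 <connection-uuid>)"
```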
02-09-2021
05:52 AM
@medloh That is the correct solution here; the filename is always stored in a FlowFile attribute named "filename". Using the UpdateAttribute processor is the easiest way to manipulate that FlowFile attribute. You can use other attributes, static text, and even subjectless Expression Language functions like "now()" or "nextInt()" to create dynamic filenames for each FlowFile. https://nifi.apache.org/docs/nifi-docs/html/expression-language-guide.html

Hope this helps,
Matt
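For example, an UpdateAttribute property named `filename` could combine the existing attribute, static text, and subjectless functions like this (the exact attribute names and suffix are illustrative):

```
${filename:substringBeforeLast('.')}-${now():format('yyyyMMdd-HHmmss')}-${nextInt()}.csv
```

Each FlowFile then gets a unique, timestamped filename derived from its original one.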
02-09-2021
05:48 AM
@Umakanth The GetSFTP processor actually creates a verbose listing of all files on the target SFTP server that it will be getting, and then fetches all those files. Unlike the ListSFTP processor, GetSFTP is an older, deprecated processor that does not store state. My guess is that at times the listing is larger than at other times, or, as you mentioned, occasional latency results in enough time passing between creating that list and actually consuming the files that the source system has moved a listed file before it is grabbed.

In that case, moving to the newer ListSFTP and FetchSFTP processors will help handle that scenario. The listing will list all the files it sees, and FetchSFTP will fetch the content of those that have not yet been moved by the source system. FetchSFTP will still throw an exception for each file it cannot find and route those FlowFiles to the not.found relationship, which you can handle programmatically in your NiFi dataflow(s).

Thanks,
Matt
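The list-then-fetch race described above can be sketched as a local-filesystem analogy. The `fetch_listed` helper and its "not.found" routing are illustrative only, not NiFi code:

```shell
# Fetch each previously listed file; files the source system has already
# moved are routed to a "not.found" handler instead of failing the run.
fetch_listed() {
  local dest="$1"; shift
  local f
  for f in "$@"; do
    if [ -f "$f" ]; then
      cp "$f" "$dest/"         # success: content fetched
    else
      echo "not.found: $f"     # handle downstream in the dataflow
    fi
  done
}
```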
02-08-2021
08:45 AM
@medloh The schema only needs to be defined in the RecordReader configured in the PutParquet processor. The ConvertRecord processor has both a Record Reader and a Record Writer; there, the Record Writer can inherit the schema from the Record Reader or define its own schema.

Hope this helps,
Matt
02-08-2021
08:20 AM
@Jarinek NiFi Variables can only be used by component properties that support NiFi's Expression Language (EL). NiFi Parameters can be used in ANY component property, including sensitive (encrypted) ones. This gives more flexibility to users, especially those who use NiFi Registry to promote version-controlled process groups across multiple NiFi instances/clusters. It is often the case that different environments use different URLs and passwords within the same dataflows. A dataflow can thus be promoted to another environment that simply uses different Parameter values, so the user does not have to update a large number of components each time a new version of the flow is promoted from one environment to another.

You are correct that Parameters are similar to Variables with respect to assignment to a process group: you can only have one Parameter Context assigned to a Process Group.

Hope this helps,
Matt
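For reference, the two reference syntaxes differ (the property names here are illustrative): variables use the Expression Language form and only work in EL-supporting properties, while parameters use the `#{...}` form and work in any property, including sensitive ones:

```
Remote URL:  ${sftp.hostname}    (variable; EL-supporting properties only)
Password:    #{sftp.password}    (parameter; any property, incl. sensitive)
```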
02-05-2021
08:09 AM
1 Kudo
@medloh The article you are using for reference is old and a bit out of date. As part of the work that went into NIFI-3921, the schema properties within the PutParquet processor were removed. Before those changes, at the time of the article you referenced, you had to set the schema properties on the processor, and they had to match the schema properties set in the RecordReader. With the changes, the processor simply gets them from the reader, so they do not need to be configured a second time in the processor properties.

Also, at the time of that article there were no ParquetReader or ParquetRecordSetWriter controller services. Now that NiFi has a Parquet reader and writer, you can use the ConvertRecord processor to read a source FlowFile and convert it to Parquet within your dataflow, and then have the freedom to use whatever processor you want downstream to write out the Parquet content. You can think of PutParquet as a combination of ParquetRecordSetWriter and PutHDFS, with only the RecordReader being selectable.

Hope this helps,
Matt
02-02-2021
07:12 AM
@BhaveshP I am in complete agreement with @tusharkathpal's response. But you should be able to work around this issue through a configuration change in your nifi.properties file:

nifi.web.proxy.host=dev.example.com:<port number>

Property description: A comma-separated list of allowed HTTP Host header values to consider when NiFi is running securely and will be receiving requests to a different host[:port] than the one it is bound to. For example, when running in a Docker container or behind a proxy (e.g. localhost:18443, proxyhost:443). By default, this value is blank, meaning NiFi only allows requests sent to the host[:port] that NiFi is bound to.

Since the hostname your client is using does not match any SAN in the individual nodes' certificates, the above property allows NiFi to accept this additional hostname. The other option is to create new certificates for each of your NiFi nodes where "dev.example.com" is added as an additional SAN entry.

Hope this helps,
Matt
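For the certificate route, a quick way to check whether a node's certificate already carries the extra SAN entry (the hostname is a placeholder; `-ext` requires OpenSSL 1.1.1+):

```shell
# Print the subjectAltName entries of a PEM certificate so you can check
# whether the proxy hostname (e.g. dev.example.com) is already present.
cert_sans() {
  openssl x509 -in "$1" -noout -ext subjectAltName
}
```

For a live node you could feed it the certificate retrieved with `openssl s_client -connect <host>:<port>` instead of a local PEM file.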