Member since: 07-30-2019
Posts: 3374
Kudos Received: 1616
Solutions: 998
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
|  | 31 | 10-20-2025 06:29 AM |
|  | 171 | 10-10-2025 08:03 AM |
|  | 146 | 10-08-2025 10:52 AM |
|  | 143 | 10-08-2025 10:36 AM |
|  | 203 | 10-03-2025 06:04 AM |
07-09-2025
10:21 AM
@PradNiFi1236 There is not much here to work with. I suggest first comparing the NiFi configuration files across all your nodes to make sure they are identical, with the exception of hostnames, keystores, and truststores. Are you using a load balancer? If so, do you see a change in behavior if you enable session affinity (sticky sessions) in your LB? Please help our community grow. If you found any of the suggestions/solutions provided helped you with solving your issue or answering your question, please take a moment to login and click "Accept as Solution" on one or more of them that helped. Thank you, Matt
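A minimal sketch of the comparison step, assuming you have pulled a copy of each node's nifi.properties locally. The file contents, paths, and the list of per-node property prefixes are illustrative only; extend the ignore list to match your deployment:

```python
# Sketch: compare two nodes' nifi.properties while ignoring settings
# that are expected to differ per node (hostname, keystore, truststore).
# Property prefixes and sample contents below are examples, not a
# complete list for a real cluster.
IGNORED_PREFIXES = (
    "nifi.web.https.host",
    "nifi.security.keystore",
    "nifi.security.truststore",
)

def normalized(text: str) -> set:
    """Return the set of property lines, minus comments and per-node entries."""
    return {
        line.strip()
        for line in text.splitlines()
        if line.strip()
        and not line.startswith("#")
        and not line.startswith(IGNORED_PREFIXES)
    }

# Stand-ins for the files copied from node1 and node2:
node1 = "nifi.web.https.host=node1\nnifi.cluster.is.node=true\n"
node2 = "nifi.web.https.host=node2\nnifi.cluster.is.node=true\n"

diff = normalized(node1) ^ normalized(node2)  # symmetric difference
print("configs match" if not diff else sorted(diff))
```

Any line that shows up in the symmetric difference is a property present or valued differently on one node only, which is exactly what you want to inspect first.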
07-09-2025
10:14 AM
Possibly related to https://issues.apache.org/jira/browse/NIFI-14462. I suggest reviewing the discussion in that Apache Jira and reviewing your "nifi.web.request.timeout" setting in the nifi.properties file. Adjusting this setting may help here. Thank you, Matt
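For reference, the setting lives in nifi.properties and takes a NiFi time period (the default in recent releases is "60 secs"). The value below is an illustrative example, not a recommendation for your environment:

```
# nifi.properties -- example only; tune to your environment
nifi.web.request.timeout=5 mins
```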
07-07-2025
05:54 AM
@MK77 First, let's clarify the ZooKeeper (ZK) elected roles in Apache NiFi.

Primary: ZK elects one node in the cluster as the "Primary" node. Processor components on the canvas configured with Execution=Primary node will only get scheduled on that elected primary node. No other nodes will schedule these processors to execute.

Cluster Coordinator: ZK elects one of the nodes as the cluster coordinator. Other nodes learn which node is the elected cluster coordinator from ZK. All nodes send heartbeats to the cluster coordinator to form the cluster.

Any node in the NiFi cluster can be assigned either or both of these roles. There is no guarantee that the same node(s) will always be assigned these roles; even after the NiFi cluster is formed and roles are assigned, the nodes holding these roles can change.

The flow.json.gz contains the dataflows on the canvas that are loaded on startup. The flow.xml.gz is only loaded if the flow.json.gz is missing; if NiFi loads the dataflow from the flow.xml.gz, it will generate a flow.json.gz from that flow.xml.gz.

Now on to your problem... Neither of the log lines you shared points to any problem:

"Invalid State Cannot replicate request to Node <node-hostname:port> because the node is not connected" -- this simply tells you that this node can't replicate a request to another node yet because that node has not yet connected to the cluster.

"o.a.n.w.a.c.IllegalClusterStateExceptionMapper org.apache.nifi.cluster.manager.exception.IllegalClusterStateException: The Flow Controller is initializing the Data Flow.. Returning Conflict response." -- this simply tells you that the flow.json.gz is still being initialized (loaded). This process needs to complete before the node finishes startup and can join the cluster. Depending on which Apache NiFi version you are running and the size of your dataflow, this can take some time.

What is the complete version of NiFi you are using?
Without your full logs it is not possible, from what has been shared, to tell you what is going on or even whether there really is any corruption in your flow.json.gz. One thing you can do is configure your NiFi to start up with all components on your canvas stopped instead of in their last known state. This can be helpful if you have recently added a new dataflow that is perhaps causing issues initializing at startup. This is achieved by changing the following setting in the nifi.properties file:

nifi.flowcontroller.autoResumeState=false

Save a backup of your flow.json.gz before starting after changing this setting; the saved flow.json.gz will have the original saved state (Running, Stopped, Disabled) of all the components. If your NiFi cluster starts fine after making this change, you can restart your dataflows to see if any are having issues. Beyond the above suggestion, there is not enough information shared to suggest anything else. Thank you, Matt
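The backup step can be scripted; here is a minimal sketch. The paths below are placeholders (a real NiFi keeps flow.json.gz under $NIFI_HOME/conf), and the demo file stands in for the real one:

```python
# Sketch: timestamped backup of flow.json.gz before restarting with
# nifi.flowcontroller.autoResumeState=false. Paths are placeholders.
import pathlib
import shutil
import time

conf = pathlib.Path("/tmp/nifi-conf-demo")  # stand-in for $NIFI_HOME/conf
conf.mkdir(exist_ok=True)
flow = conf / "flow.json.gz"
flow.write_bytes(b"demo")                   # placeholder for the real file

stamp = time.strftime("%Y%m%d-%H%M%S")
backup = flow.with_name(f"flow.json.gz.{stamp}.bak")
shutil.copy2(flow, backup)                  # copy2 preserves the mtime
print(backup.name)
```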
07-03-2025
06:36 AM
@HoangNguyen There isn't an existing processor included with Apache NiFi capable of performing a UNION ALL against the contents of multiple FlowFiles. The JoinEnrichment processor is the only processor that can modify the contents of one FlowFile using the contents of another, but it only handles two FlowFiles (an original FlowFile and an enrichment FlowFile) in a single execution. The other record-oriented processors all perform actions against the individual records in a FlowFile. You may need to develop your own custom processor for such a task: something like the MergeRecord processor that bins like FlowFiles and then performs a UNION ALL on those binned FlowFiles. You could also raise a Jira in Apache NiFi (https://issues.apache.org/jira/browse/NIFI) asking for a processor that can perform such an operation; maybe someone would attempt to build it if there is enough Apache community interest. You could also explore what Cloudera offers its customers in terms of professional services that could help with building custom processors for Cloudera Flow Management offerings based off Apache NiFi. Thank you, Matt
07-03-2025
06:14 AM
@Rohit1997jio https://www.quartz-scheduler.org/documentation/quartz-2.3.0/tutorials/crontrigger.html Your Quartz cron "0-30 */6 * * * ?" translates to: execute every second from second 0 through 30, during every 6th minute of every hour. Note that in Quartz, */6 in the minutes field is equivalent to 0/6, so the processor gets scheduled at minutes 0/6/12/18/24/30/36/42/48/54 of every hour. If you want it to start at minute 6 instead, use 6/6, which schedules the processor at minutes 6/12/18/24/30/36/42/48/54 of every hour (you would, however, have a 12 minute gap between the end of each hour and minute 6 of the next hour with this config). Also keep in mind that scheduling does not necessarily mean execution at the same time. NiFi has a Max Timer Driven Thread pool from which threads are handed out to scheduled processors. With very large flows or processors with long-running threads, a scheduled processor may need to wait for a thread to become available before it actually executes. Thank you, Matt
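A tiny stdlib sketch of how the "/" increment in a Quartz-style minutes field resolves (per the Quartz CronTrigger docs, the start value defaults to 0 when "*" is used). This models only the increment syntax, not a full cron parser:

```python
# Sketch: minutes matched by a Quartz-style "start/step" minutes field.
# Models only the '/' increment syntax; not a full cron parser.
def matched_minutes(field: str) -> list:
    start_s, step_s = field.split("/")
    start = 0 if start_s == "*" else int(start_s)  # '*' start defaults to 0
    step = int(step_s)
    return list(range(start, 60, step))

print(matched_minutes("*/6"))  # [0, 6, 12, 18, 24, 30, 36, 42, 48, 54]
print(matched_minutes("0/6"))  # same minutes as */6
print(matched_minutes("6/6"))  # [6, 12, 18, 24, 30, 36, 42, 48, 54]
```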
07-03-2025
05:56 AM
@NifiEnjoyer Welcome to the community. As this thread is related to the deprecation of NiFi templates in Apache NiFi 2 and is an old thread, it would be better to start a new community question with your query about downloading and uploading flow definitions. You'll want to include your source and destination Apache NiFi versions in your question details. Feel free to @MattWho in your new community question. Thank you, Matt
07-02-2025
08:31 AM
@HoangNguyen All the ForkEnrichment processor does is add two specific FlowFile attributes ("enrichment.group.id" and "enrichment.role") to each FlowFile it outputs. The JoinEnrichment processor depends on receiving two FlowFiles with a matching "enrichment.group.id": one with "enrichment.role" = ORIGINAL and the other FlowFile with "enrichment.role" = ENRICHMENT. So you can do something like this, for example: fork the starting FlowFile and join the first enrichment, then use ForkEnrichment again to generate the needed FlowFile attributes for the second JoinEnrichment operation. Thank you, Matt
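To make the attribute contract concrete, here is a sketch that models the two attributes ForkEnrichment stamps on its outputs, with FlowFiles represented as plain dicts. The helper function and attribute values are illustrative; only the two attribute names come from NiFi itself:

```python
# Sketch: the two attributes ForkEnrichment adds, modeled on dicts.
# fork_enrichment() is a hypothetical helper, not a NiFi API.
import uuid

def fork_enrichment(flowfile: dict) -> tuple:
    """Return (original, enrichment) copies sharing one group id."""
    group_id = str(uuid.uuid4())
    original = {**flowfile,
                "enrichment.group.id": group_id,
                "enrichment.role": "ORIGINAL"}
    enrichment = {**flowfile,
                  "enrichment.group.id": group_id,
                  "enrichment.role": "ENRICHMENT"}
    return original, enrichment

orig, enr = fork_enrichment({"filename": "data.json"})
# JoinEnrichment matches on the shared group id and opposite roles:
print(orig["enrichment.group.id"] == enr["enrichment.group.id"])
```

Forking a second time generates a fresh group id, which is why the two-stage fork/join pattern works for chaining enrichments.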
07-02-2025
05:26 AM
@Rohit1997jio The content of a NiFi FlowFile does not live in NiFi heap memory space; only the FlowFile metadata/attributes are held in NiFi heap memory. Even then, there are per-connection thresholds at which swap files are created to reduce that heap usage. Some processors may need to load content into heap memory when they execute against FlowFile(s).

Before making recommendations on your ConsumeKafkaRecord processor configuration, more information about your NiFi and Kafka topic is needed. Are you running a multi-node NiFi cluster or a single instance of NiFi? If a cluster, how many nodes make up your NiFi cluster? How many partitions are set up on the target Kafka topic?

Kafka assigns partitions to the different consumers in a consumer group. So let's say you have 10 partitions on your Kafka topic, 1 NiFi instance, and a ConsumeKafkaRecord configured with 1 concurrent task: all 10 of those partitions would be assigned to that one consumer. When the ConsumeKafkaRecord executes, it will consume from one of those partitions, the next execution from the next partition, and so on. This is likely why you are not seeing all the Kafka messages consumed when you schedule the processor to execute only once every 4 hours. Even if you were to set concurrent tasks to 10 on the ConsumeKafkaRecord processor, the scheduler is only going to allow one execution every 4 hours. So in this case you would be best suited to set 10 concurrent tasks and adjust your Quartz cron scheduler so it schedules every second for 10 seconds every 4 hours. Also keep in mind the "Max Poll Records" setting, as it controls the maximum number of records (messages) added to the single record FlowFile created during each execution. If you have a lot of records, you might consider widening that scheduling window every 4 hours to maybe 30 seconds to make sure you get all messages from every partition.
Now assuming you have a multi-node NiFi cluster with, for example, 5 nodes, your ConsumeKafkaRecord processor configured with a group.id, and 10 partitions: you would set concurrent tasks to 2 (2 consumers x 5 nodes = 10 consumers in the consumer group). Kafka will assign one partition to each of these 10 consumers in the consumer group. Hope this helps you configure your ConsumeKafkaRecord processor so you can be successful with your requirement. Thank you, Matt
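The sizing arithmetic above can be sketched as a one-liner; the helper name and the example numbers are mine, not NiFi's:

```python
# Sketch: ConsumeKafkaRecord concurrent tasks per node so that
# nodes * tasks covers every partition (one consumer per partition).
import math

def tasks_per_node(partitions: int, nifi_nodes: int) -> int:
    """Smallest per-node task count with nodes * tasks >= partitions."""
    return math.ceil(partitions / nifi_nodes)

print(tasks_per_node(10, 5))  # 2 -> 2 tasks x 5 nodes = 10 consumers
print(tasks_per_node(10, 1))  # 10 on a standalone NiFi instance
```

Going above one consumer per partition buys nothing: Kafka leaves the extra consumers in the group idle.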
07-01-2025
05:50 AM
@Bhar Can you share more detail? Without it, I would only be making random guesses. What version of Apache NiFi are you using? Is this a single instance of NiFi or a multi-node NiFi cluster? How is your MergeContent processor configured? Thank you, Matt
07-01-2025
05:44 AM
@HoangNguyen Welcome to the community. It would be very difficult to provide any suggestions with the limited information you have shared; please share more detail about your use case and what you are trying to accomplish. The JoinEnrichment processor is used in conjunction with the ForkEnrichment processor. For a JoinEnrichment processor to join two NiFi FlowFiles, those two FlowFiles must both have a matching group id set in an "enrichment.group.id" attribute, and each must also have an "enrichment.role" attribute set appropriately (ORIGINAL on the FlowFile to be enriched and ENRICHMENT on the FlowFile containing the enrichment data). Thank you, Matt