07-01-2016
12:06 PM
NiFi 1.0 is deep into development right now. Expect to see it up for a vote in August. NiFi 1.0 has had considerable rework done across the board (new UI, no more NCM for clustering, etc.). Very exciting stuff.
06-30-2016
03:14 PM
6 Kudos
@Alexander Aolaritei NiFi can produce a lot of provenance data. The solution you are looking for will be coming in Apache NiFi 1.0 in the form of a NiFi reporting task. This "SiteToSiteProvenanceReportingTask" will use the NiFi Site-to-Site (S2S) protocol to send provenance events to another NiFi instance in configurable batches. Of course, that target NiFi instance could be the sending instance itself; however, that would just produce even more provenance events locally as you handle those messages. So it may be wise to stand up another NiFi instance just for provenance event handling. Upon receiving those provenance events via an S2S input port, you can use standard NiFi processors to split/merge them, route them, and store them in your desired end point (whether that is local file(s), an external DB, etc.). I am not a developer, so I cannot help with the custom solution you are working on, but I wanted to share what is coming as another viable solution to your needs.
Thanks, Matt
06-28-2016
08:27 PM
1 Kudo
@AnjiReddy Anumolu Let me start off by making sure I fully understand the dataflow you have created so I can better answer your question. You have added a GetFile processor to your flow, which picks up file(s) from a local file system directory and sends them via the success relationship to a LogAttribute processor. What did you do with the LogAttribute processor's success relationship? If it is auto-terminated, you are essentially telling NiFi you are done with the files following a successful logging of the file(s)' FlowFile attributes/metadata. If the success relationship has not been defined, the processor will remain invalid and cannot be run. In that case the file(s) picked up by the GetFile processor will remain queued on the connection between the GetFile processor and the LogAttribute processor.
In either case, when NiFi ingests file(s) they are placed in the NiFi content repository. The location of the content repository is defined/configured in the nifi.properties file. The default places them in a directory created within the default NiFi installation directory:
nifi.content.repository.directory.default=./content_repository
NiFi stores file(s) in what are known as claims to make the most efficient use of the system's hard disks. A claim can contain one to many files. The default claim configuration is also defined/configured in the nifi.properties file. The default configuration is as follows:
nifi.content.claim.max.appendable.size=10 MB
nifi.content.claim.max.flow.files=100
Files smaller than 10 MB may be stored with other files, with up to 100 total files in a single claim. A file larger than 10 MB will end up in a claim of one.
At the same time that files are written to a claim, FlowFile attributes/metadata about the ingested files are written to the flowfile repository. The location of the flowfile repository is also defined/configured in the nifi.properties file:
nifi.flowfile.repository.directory=./flowfile_repository
These FlowFile attributes/metadata will contain information such as filename, file size, location of the claim in the content repository, claim offset, etc. The claim offset is the starting byte location of a particular file's content within a claim. The file size defines the number of bytes from that offset that make up the complete data.
The nifi-app.log contains fairly robust logging by default (configured in the logback.xml file). When NiFi ingests files, NiFi will log that, and that log line will contain information about the claim (location and offset).
When NiFi auto-terminates FlowFiles, they are removed from the content repository. Depending on the content repository archive setup, the file(s) may be archived for a period of time. Archived file(s) can be replayed using the provenance NiFi UI.
Thanks, Matt
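To make the offset and file size idea above concrete, here is a minimal Python sketch. It is only a toy model and not NiFi's actual repository code: the `read_content` helper, the sample byte strings, and the in-memory "claim" are all hypothetical, and a real claim is a file on disk in the content repository.

```python
def read_content(claim: bytes, offset: int, size: int) -> bytes:
    """Return the bytes of one file stored inside a claim.

    offset: starting byte of the file's content within the claim
    size:   number of bytes from that offset that make up the file
    """
    if offset < 0 or size < 0 or offset + size > len(claim):
        raise ValueError("offset/size fall outside the claim")
    return claim[offset:offset + size]


# Hypothetical claim holding three small files written back to back.
files = [b"alpha", b"bravo!", b"charlie"]
claim = b"".join(files)

# Record (offset, size) for each file as it is appended, which is the kind
# of information the FlowFile metadata is described as carrying.
records = []
position = 0
for content in files:
    records.append((position, len(content)))
    position += len(content)

# Recover the second file using only its offset and size.
second = read_content(claim, *records[1])
```

The point of the sketch is that a FlowFile never needs its own file on disk: the claim location plus an offset and a size are enough to find its content.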
06-23-2016
09:41 PM
Was your VM or NiFi restarted since HDP was installed?
06-08-2016
12:19 PM
Glad I could help and good to hear you are now up and running.
06-03-2016
05:33 PM
You can edit files as root. Editing files does not change ownership. You just need to make sure that, once you have finished editing, all files are owned by the user who will be running your NiFi instances.
Give yourself a fresh start and delete the flow.tar on your NCM and the flow.xml.gz and templates dir on your Node.
So at the end of configuring your two NiFi installs (one install configured to be the NCM and one separate install configured to be a Node), you started your NCM successfully? Looking in the nifi-app.log for your NCM, do you see the following lines?
2016-06-03 ... INFO [main] org.apache.nifi.web.server.JettyServer NiFi has started. The UI is available at the following URLs:
2016-06-03 ... INFO [main] org.apache.nifi.web.server.JettyServer https://Bxxxxx.xxxxxx.com:8080/nifi
You then go to your other NiFi installation configured as your Node and start it.
After it has started successfully, it will start attempting to send heartbeats to Bxxxxx.xxxxxxx.com on port 1xxx. You should see these incoming heartbeats logged in the nifi-app.log on your NCM. Do you see these?
INFO [Process NCM Request-1] o.a.n.c.p.impl.SocketProtocolListener Received request 411684b2-25cb-461f-978e-fb3bda6a7ef0 from Axxxxx.xxxxxx.com
INFO [Process NCM Request-1] o.a.n.c.manager.impl.WebClusterManager Node Event: (......) 'Connection requested from new node. Setting status to connecting.'
After that the NCM will either mark the node as connected or give a reason for not allowing it to connect.
If you are not seeing these heartbeats in the NCM nifi-app.log, then something is blocking the TCP traffic on the specified port. I did notice in the above example you provided 1xxx as your cluster manager port. Is that port 1024 or higher? Ports below 1024 are reserved and can't be used by non-root users. If you are running your NCM as a user other than root (as it sounds from the above), NiFi will fail to bind to that port to listen for these heartbeats.
Matt
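One way to test both points from the Node's side is a small Python check like the sketch below. It is only an illustration: the helper names are made up, and the host and port in the usage note are placeholders for your own NCM address and cluster manager protocol port.

```python
import socket


def is_privileged_port(port: int) -> bool:
    """Ports below 1024 normally require root to bind on Unix-like systems."""
    return port < 1024


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

Run from the Node, something like `can_connect("ncm.example.com", 9001)` (placeholder values) returning False would suggest that a firewall is blocking the port or that nothing is listening on it on the NCM.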
06-03-2016
04:13 PM
1 Kudo
A fresh install of NiFi has no flow.xml.gz file until after it is started for the first time.
Are these fresh NiFi installs or installations that were previously run standalone? If they were previously standalone, you can't simply tell them they are nodes and NCMs and expect it to work. Your NCM does not run with a flow.xml.gz like your nodes and standalone instances do. The NCM uses a flow.tar file. The flow.tar would be created on startup and contain an empty flow.xml. When you started your Node (with an existing flow.xml.gz file), it would have communicated with the NCM but been rejected because the flow on the node would not have matched what was on the NCM. If you are looking to migrate from a standalone instance to a cluster, I would suggest reading this:
https://community.hortonworks.com/content/kbentry/9203/how-to-migrate-a-standalone-nifi-into-a-nifi-clust.html
Let me make sure I understand your environment:
1. You have two different installations of NiFi.
2. One installation of NiFi is set up and configured to be a non-secure (http) NCM.
3. One installation of NiFi is set up and configured to be a non-secure (http) Node.
4. The "# cluster common properties (cluster manager and nodes must have same values) #" section in the nifi.properties files on both the NCM and Node(s) is configured identically.
5. In that section, nifi.cluster.protocol.is.secure=false is set on both (it cannot be true if running http).
6. The "# cluster node properties (only configure for cluster nodes) #" section has been configured only on your Node.
- The following properties in that node section are configured:
nifi.cluster.is.node=true
nifi.cluster.node.unicast.manager.address=
nifi.cluster.node.unicast.manager.protocol.port=
and the port matches what you configured in the next section on your NCM.
7. The "# cluster manager properties (only configure for cluster manager) #" section has been configured on your NCM only.
- nifi.cluster.is.manager=true
Thanks, Matt
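The checklist above can be roughly automated. The following Python sketch parses two nifi.properties files and reports mismatches. The helper names are made up for illustration, and the NCM-side port key (`nifi.cluster.manager.protocol.port`) is an assumption not named in this thread, so it is passed as a parameter; confirm it against your own file.

```python
def parse_properties(text: str) -> dict:
    """Parse key=value lines, skipping blank lines and # comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def check_cluster_config(ncm: dict, node: dict,
                         ncm_port_key: str = "nifi.cluster.manager.protocol.port") -> list:
    """Return a list of human-readable problems (empty if none were found)."""
    problems = []
    if ncm.get("nifi.cluster.is.manager") != "true":
        problems.append("NCM: nifi.cluster.is.manager is not true")
    if node.get("nifi.cluster.is.node") != "true":
        problems.append("Node: nifi.cluster.is.node is not true")
    for name, props in (("NCM", ncm), ("Node", node)):
        # For a non-secure (http) cluster this must be false on both sides.
        if props.get("nifi.cluster.protocol.is.secure") != "false":
            problems.append(f"{name}: nifi.cluster.protocol.is.secure is not false")
    if not node.get("nifi.cluster.node.unicast.manager.address"):
        problems.append("Node: unicast manager address is empty")
    node_port = node.get("nifi.cluster.node.unicast.manager.protocol.port")
    # ncm_port_key is an assumed property name; adjust to match your file.
    if not node_port or node_port != ncm.get(ncm_port_key):
        problems.append("Node manager protocol port does not match the NCM port")
    return problems
```

Typical use would be to read each file with `open(path).read()`, pass the text through `parse_properties`, and print whatever `check_cluster_config` returns.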
06-03-2016
03:38 PM
Are the cluster NCM and Node(s) configured for https or http?
The NCM needs to be able to communicate with the http(s) port and the node protocol port configured in the nifi.properties file on the Node(s).
The Node needs to be able to communicate with the cluster manager protocol port configured in the nifi.properties file on the NCM.
Thanks, Matt
06-02-2016
01:18 PM
1 Kudo
There are a few things you can do here, if I am understanding correctly what you are trying to accomplish.
1. The logback.xml can be modified so that logs from specific processor components are redirected to a specific new log file. You can specify where that new log is written. You can also specify the log level for those components (WARN level would get you just WARN and ERROR messages).
2. In your dataflow you could use the TailFile processor to monitor that new log and route any generated FlowFiles to a PutEmail processor to send them to your admin. In addition to email, you can route those FlowFiles to a processor of your choice to put a copy in a specific location as well, either locally or remotely.
Thanks, Matt
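As a rough sketch of option 1, a logback.xml fragment along these lines could be added inside the existing `<configuration>` element. The appender name, the log file path, and the choice of PutFile as the processor class are assumptions for illustration only; substitute the processor classes you actually want to monitor.

```xml
<!-- Hypothetical appender that writes to its own rolling log file -->
<appender name="PROCESSOR_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>logs/nifi-processor.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
        <fileNamePattern>logs/nifi-processor_%d.log</fileNamePattern>
        <maxHistory>30</maxHistory>
    </rollingPolicy>
    <encoder class="ch.qos.logback.classic.encoder.PatternLayoutEncoder">
        <pattern>%date %level [%thread] %logger{40} %msg%n</pattern>
    </encoder>
</appender>

<!-- Send WARN and ERROR messages from one processor class to that file only -->
<logger name="org.apache.nifi.processors.standard.PutFile" level="WARN" additivity="false">
    <appender-ref ref="PROCESSOR_FILE"/>
</logger>
```

With `additivity="false"` the messages go only to the new file; remove that attribute if you also want them to keep appearing in nifi-app.log.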
05-31-2016
02:58 PM
Ahmad, The line you are seeing in the nifi-bootstrap.log indicates the JVM started successfully. You need to check the nifi-app.log to make sure the application loaded successfully. In the nifi-app.log you will find the following lines if the application successfully loaded:
2016-05-31 10:46:44,347 INFO [main] org.apache.nifi.web.server.JettyServer NiFi has started. The UI is available at the following URLs:
2016-05-31 10:46:44,347 INFO [main] org.apache.nifi.web.server.JettyServer http://<someaddress or FQDN>:8088/nifi
Verify that the hostname or IP displayed on this line is reachable/resolvable on the system you are running your web browser from.
Thanks, Matt
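If you would rather script this check than read the log by eye, a small Python sketch such as the following can pull the UI URLs out of nifi-app.log. It assumes the log lines look like the two shown above; the function name and the sample address in the usage note are hypothetical.

```python
import re

STARTED = "NiFi has started. The UI is available at the following URLs:"
URL_PATTERN = re.compile(r"JettyServer\s+(https?://\S+)")


def find_ui_urls(log_text: str) -> list:
    """Return the UI URLs logged after the most recent startup message.

    An empty list means the startup message (or its URL lines) was not found.
    """
    lines = log_text.splitlines()
    urls = []
    for i, line in enumerate(lines):
        if STARTED in line:
            urls = []  # keep only the URLs from the latest startup
            for follow in lines[i + 1:]:
                match = URL_PATTERN.search(follow)
                if not match:
                    break
                urls.append(match.group(1))
    return urls
```

Calling `find_ui_urls(open("logs/nifi-app.log").read())` (the path is an assumption about your install layout) gives you the addresses to test from the machine running your browser.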