Member since: 09-29-2015
Posts: 871
Kudos Received: 723
Solutions: 255
08-11-2016
01:13 PM
1 Kudo
It should be in nifi-app.log... in the code it does:

context.getBulletinRepository().addBulletin(bulletin);
logger.warn(message);

The logger is a standard SLF4J logger, which ends up being handled by logback and controlled by the logback.xml in the conf directory:

Logger logger = LoggerFactory.getLogger(MonitorDiskUsage.class);
08-10-2016
01:19 PM
1 Kudo
In MergeContent there is a Delimiter Strategy; choose "Text", which means it uses the values typed into the Header, Demarcator, and Footer properties. The Demarcator is what gets put between each FlowFile that is merged together, so for example a newline demarcator joins three single-line FlowFiles into one FlowFile with three lines. You can enter a new line with shift+enter.
08-10-2016
01:15 PM
It is still considered an unstable beta release, so it is not recommended for production, but it is stable enough to run in a test/dev environment. I can't really give a specific timeline, but it shouldn't be too far away. The community is already working on the remaining issues and anything found from testing the beta.
08-09-2016
09:12 PM
4 Kudos
In the case of SplitText, the approach when splitting large files is to use two instances of SplitText, where the first one might split to 10-20k lines per flow file, and the second splits down to 1 line. This avoids producing millions of flow files in one execution of the processor.

For some other processors it is common for their description to include a warning if the processor reads the whole flow file into memory, so the user is aware that if they send in 2GB of data, it's going to use 2GB of the heap, or hit an OutOfMemoryError if that much isn't available. Whenever possible, processors should perform their processing in a streaming fashion to avoid taking up large chunks of memory.

As far as sharing the cluster among teams, NiFi doesn't really have resource isolation, but NiFi 1.0.0 (initial BETA released yesterday) is going to introduce a fine-grained security model so that different teams and people can be granted access to different parts of the flow. Team 1 might only have access to Process Group 1, and Team 2 might only have access to Process Group 2, so each team can't see what the other team is doing or change their flow.
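As a rough sketch of that streaming style (my own illustration, not code from any particular processor; only the ProcessSession and InputStreamCallback APIs are NiFi's, the class and per-line handling are hypothetical):

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.ProcessSession;
import org.apache.nifi.processor.io.InputStreamCallback;

// Hypothetical helper showing a streaming read of flow file content.
public class StreamingReadExample {
    void readLineByLine(final ProcessSession session, final FlowFile flowFile) {
        session.read(flowFile, new InputStreamCallback() {
            @Override
            public void process(final InputStream in) throws IOException {
                final BufferedReader reader =
                        new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8));
                String line;
                while ((line = reader.readLine()) != null) {
                    // handle one line at a time, so a 2GB flow file
                    // never needs 2GB of heap
                }
            }
        });
    }
}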
08-09-2016
08:21 PM
By "reset on restart" I meant that they are held in memory so if the NiFi Java process restarts the counters are reset. Starting/stopping components on the graph do not impact the counters. We don't really do windowing operations... Counters are usually just some processor specific count that could be helpful for debugging/monitoring purposes. It is really just meant for someone to look at in the UI to figure out how something is working, but not really for the processor to retrieve the value later. In fact, I don't think there is any other processor API call besides adjustCounter, so all you can really do is increment. In a cluster I believe you should see the aggregated value of X for the whole cluster, it doesn't break it out for each node. One other point I forgot, is that behind the scenes it automatically keeps track of the aggregate count across instances of the same type of processor, and also for each instance. So if you had two ListenSyslog processors, you should see Messages Received for All Listen Syslog Processors, Messages Received for ListenSyslog #1, and Messages Received for ListenSyslog #2.
08-09-2016
06:11 PM
2 Kudos
Counters are a way for a processor to track how many times some event occurred, mostly for monitoring purposes. There is a method in the ProcessSession:

void adjustCounter(String name, long delta, boolean immediate);

So calling this method with ("myCounter", 1, true) would increment the count of "myCounter" by 1, or create the counter if it didn't exist. Counters are not persistent and will be reset on restart. An example is in the syslog processors, which increment a counter for each syslog message received.
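As a small illustration, a processor's onTrigger() might call it like this (a hypothetical fragment of my own; the counter name and REL_SUCCESS relationship are made up, only adjustCounter itself is the real API):

import org.apache.nifi.flowfile.FlowFile;
import org.apache.nifi.processor.ProcessContext;
import org.apache.nifi.processor.ProcessSession;

// Hypothetical fragment; assumes it sits inside a processor class
// that defines a REL_SUCCESS relationship.
public void onTrigger(final ProcessContext context, final ProcessSession session) {
    final FlowFile flowFile = session.get();
    if (flowFile == null) {
        return;
    }
    // Increment (or create) "messages.received" by 1; immediate=false
    // defers the adjustment until the session is committed.
    session.adjustCounter("messages.received", 1, false);
    session.transfer(flowFile, REL_SUCCESS);
}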
08-09-2016
05:23 PM
2 Kudos
The MergeContent processor can be used to merge JSON together and has a property called "Correlation Attribute Name" which, when specified, will merge together flow files that have the same value for the specified attribute. In your scenario you first need to use EvaluateJsonPath to extract "service" and "eventName" from the JSON document. Based on your sample JSON it seems like they are at the root level of the document, so I believe something like:

service = $.service
eventName = $.eventName

Then you need to get these two values into a single attribute, so you can use UpdateAttribute with something like:

serviceEventName = ${service}/${eventName}

Then in MergeContent set the "Correlation Attribute Name" to "serviceEventName". You can also specify the minimum group size and age so that you can merge together either 100MB or 1 hour worth of data.
08-04-2016
11:14 PM
Hi Stephanie, I'm actually not sure about that one, but I think it has more to do with the Kafka client and the number of partitions. You should be able to have a ConsumeKafka processor running on each node of your NiFi cluster and each pulling data without doing anything special. It might be good to start a new question about this specific problem with ConsumeKafka only consuming data on one node.
08-04-2016
07:01 PM
A common way to do this is to have the file written to ".filename" first and renamed to "filename" when done. This is why the GetFile processor's File Filter property defaults to:

[^\.].*

That regular expression matches any filename that doesn't start with a period. I realize you may not have control over how the files are being written to the directory, so this may not be an option if you can't control that.
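As a quick standalone check of how that filter behaves (plain Java, outside NiFi; the filenames are just examples of my own):

import java.util.regex.Pattern;

// Tests the default GetFile File Filter against example filenames.
public class FileFilterDemo {
    public static void main(String[] args) {
        final Pattern filter = Pattern.compile("[^\\.].*");
        System.out.println(filter.matcher(".data.csv").matches()); // false: still being written
        System.out.println(filter.matcher("data.csv").matches());  // true:  ready to pick up
    }
}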
08-03-2016
08:59 PM
Just to clarify, what Haimo mentioned is that the ConsumeKafka processor does not use any of NiFi's state management capabilities because the Kafka client maintains the offsets. Regarding the Kafka client: as of 0.9.0 I believe the new consumer no longer stores offsets in ZooKeeper and instead stores them in an internal Kafka topic on the brokers, which is why you see it connecting directly to the broker and not using ZooKeeper.