Member since
09-29-2015
871
Posts
723
Kudos Received
255
Solutions
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3372 | 12-03-2018 02:26 PM
 | 2318 | 10-16-2018 01:37 PM
 | 3636 | 10-03-2018 06:34 PM
 | 2411 | 09-05-2018 07:44 PM
 | 1835 | 09-05-2018 07:31 PM
08-16-2016
05:47 PM
1 Kudo
I don't have much experience with the site-to-site implementation, but it seems like it wouldn't be too difficult to support adding the transit.uri as an attribute when receiving flow files over site-to-site (if that's all we are talking about): https://github.com/apache/nifi/blob/e23b2356172e128086585fe2c425523c3628d0e7/nifi-nar-bundles/nifi-framework-bundle/nifi-framework/nifi-site-to-site/src/main/java/org/apache/nifi/remote/protocol/AbstractFlowFileServerProtocol.java#L445 Alternatively, maybe minifi-cpp should have the ability to send metadata, since NiFi already supports receiving attributes over site-to-site.
08-16-2016
01:59 PM
2 Kudos
In general, concurrent tasks is the number of threads calling onTrigger for an instance of a processor. In a cluster, if you set concurrent tasks to 4, then it is 4 threads on each node of your cluster. I am not as familiar with all the ins and outs of the Kafka processors, but GetKafka does something like this:

```java
int concurrentTaskToUse = context.getMaxConcurrentTasks();
final Map<String, Integer> topicCountMap = new HashMap<>(1);
topicCountMap.put(topic, concurrentTaskToUse);
final Map<String, List<KafkaStream<byte[], byte[]>>> consumerMap =
        consumer.createMessageStreams(topicCountMap);
```

The consumer is from the Kafka 0.8 client, so it is creating a message stream for each concurrent task. Then when the processor is triggered it takes one of those message streams and consumes a message, and since multiple concurrent tasks are triggering the processor, it is consuming from each of those streams in parallel. As far as rebalancing, I think the Kafka client handles that transparently to NiFi, but I am not totally sure. Messages that have already been pulled into a NiFi node will stay there until the node is back up and processing.
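The stream-per-task model above can be sketched in plain Python (a hypothetical illustration of the concurrency pattern, not NiFi or Kafka code): each concurrent task owns one stream, here a queue, and the tasks drain their streams in parallel.

```python
import queue
import threading

# Stand-in for the Kafka 0.8 message streams: one queue per concurrent
# task, mirroring createMessageStreams(topicCountMap) above.
concurrent_tasks = 4
streams = [queue.Queue() for _ in range(concurrent_tasks)]

# Pretend the broker delivered 20 messages, spread across the streams.
for i in range(20):
    streams[i % concurrent_tasks].put(f"message-{i}")

consumed = []
lock = threading.Lock()

def on_trigger(stream):
    """Each 'concurrent task' drains only its own stream, like onTrigger."""
    while True:
        try:
            msg = stream.get_nowait()
        except queue.Empty:
            return
        with lock:
            consumed.append(msg)

threads = [threading.Thread(target=on_trigger, args=(s,)) for s in streams]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(len(consumed))  # all 20 messages, consumed by 4 parallel tasks
```

The key point is that no two tasks share a stream, so there is no contention on the consume side; parallelism comes from the number of streams, which is why GetKafka sizes the stream map by getMaxConcurrentTasks().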
08-16-2016
01:44 PM
3 Kudos
I think it depends on what you mean by "schedule it to run every hour"... NiFi itself is always running, and individual processors are scheduled according to their needs. Every processor supports timer-based or cron-based scheduling, so using either of those you can set a source processor to run every hour. You could also use the REST API to start and stop processors as needed; anything you can do in the UI can be done through the REST API. For best practices for upgrading NiFi, see this wiki page: https://cwiki.apache.org/confluence/display/NIFI/Upgrading+NiFi For deploying changes to production there are a couple of approaches; one of them is based around templates: https://github.com/aperepel/nifi-api-deploy Some people also just move the flow.xml.gz from one environment to another, but this assumes you have parameterized everything that differs between environments.
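As a rough sketch of the REST approach, this builds the request for starting or stopping a processor. The endpoint and body shape here follow the run-status layout of more recent NiFi releases (older versions used a different entity), and the host, port, processor id, and revision are all assumptions, so check the REST API docs for your version.

```python
import json

NIFI_URL = "http://localhost:8080/nifi-api"  # assumed local NiFi instance

def run_status_request(processor_id, revision_version, state):
    """Build the URL and JSON body to start or stop a processor.

    Assumes PUT /nifi-api/processors/{id}/run-status with a body of
    {"revision": {...}, "state": "RUNNING"|"STOPPED"}; verify against
    the REST API documentation for your NiFi version.
    """
    url = f"{NIFI_URL}/processors/{processor_id}/run-status"
    body = {
        "revision": {"version": revision_version},
        "state": state,  # "RUNNING" to start, "STOPPED" to stop
    }
    return url, json.dumps(body)

# Hypothetical processor id and revision version:
url, body = run_status_request("abc-123", 0, "RUNNING")
print(url)
print(body)
```

You would send the body with an HTTP PUT (e.g. via curl or urllib); the revision version must match the processor's current revision or NiFi rejects the update.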
08-15-2016
05:56 PM
In that scenario there is always going to be something you have to set that is specific to the user. I think the best approach might be to use the REST API to change the value of GetFile's Input Directory after the user imports the template, setting it to that user's input directory.
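A sketch of what that REST call could look like, building the PUT body that updates GetFile's Input Directory. The endpoint (PUT /nifi-api/processors/{id}) and entity layout are assumptions based on the NiFi REST API, and the processor id, revision, and path are hypothetical; verify against your version's API docs before relying on this.

```python
import json

def update_directory_request(processor_id, revision_version, new_dir):
    """Build the URL and PUT body that change GetFile's Input Directory.

    Assumes the processor entity shape used by the NiFi REST API
    (component.config.properties); check the docs for your version.
    """
    url = f"http://localhost:8080/nifi-api/processors/{processor_id}"
    body = {
        "revision": {"version": revision_version},
        "component": {
            "id": processor_id,
            "config": {"properties": {"Input Directory": new_dir}},
        },
    }
    return url, json.dumps(body)

# Hypothetical processor id, revision, and per-user directory:
url, body = update_directory_request("abc-123", 3, "/data/user1/input")
print(url)
```

Only the properties you include are changed; everything else on the processor is left as-is.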
08-15-2016
03:21 PM
Is each user importing the template into a separate NiFi instance, or is there one NiFi instance with multiple users who all import the same template and want to retrieve files from different directories?
08-15-2016
02:43 PM
1 Kudo
The Input Directory property of GetFile supports Expression Language, so you can reference a system property like ${my.directory} and define my.directory in bootstrap.conf by adding another Java arg, like: java.arg.15=-Dmy.directory=/foo Then you can have a different bootstrap.conf per environment.
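For example, each environment's bootstrap.conf would carry its own value (the arg index 15 is arbitrary, just pick one that is not already in use, and the paths here are hypothetical):

```
# conf/bootstrap.conf on the dev box
java.arg.15=-Dmy.directory=/data/dev/input

# conf/bootstrap.conf on the prod box
java.arg.15=-Dmy.directory=/data/prod/input
```

With GetFile's Input Directory set to ${my.directory}, the same flow picks up the right directory on each environment after a restart (bootstrap.conf changes require restarting NiFi).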
08-15-2016
02:15 PM
2 Kudos
The ListFile processor keeps track of files it has already seen and only picks up files where the modified date is newer than the last time the processor ran. ListFile produces a flow file for each path to fetch and is used with FetchFile to actually retrieve the file.
08-13-2016
06:14 PM
1 Kudo
You can provide additional dependencies to the ExecuteScript processor by using the "Module Directory" property, as described here: http://funnifi.blogspot.com/2016/02/executescript-using-modules.html You generally shouldn't put jars into NiFi's lib directory, because that can impact all other NARs.
08-11-2016
05:45 PM
1 Kudo
Yes, that's what I was trying to say about it being the name of an attribute, and not the attribute value itself. When you put ${correlation.id}, the framework evaluates that first (in your case it ends up being something like 20121021), and then MergeContent goes looking for an attribute called "20121021", which doesn't exist.
08-11-2016
01:21 PM
4 Kudos
This answer is correct, I just wanted to add some clarification... The "Correlation Attribute Name" is not the actual value to correlate on; it's the name of an attribute that holds the value to correlate on. So as suggested, you could use an UpdateAttribute processor to create an attribute like: correlation.id = ${filename:substring(5,13)} Then in MergeContent put correlation.id as the value of Correlation Attribute Name.
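To see what that expression extracts, NiFi's substring(start, end) follows Java semantics (start inclusive, end exclusive), which maps directly to a Python slice. The filename below is hypothetical, chosen so the date sits at characters 5 through 12:

```python
# Hypothetical filename with the correlation date at characters 5..12:
filename = "file_20121021.csv"

# ${filename:substring(5,13)} in NiFi Expression Language behaves like
# Java's substring: start index inclusive, end index exclusive.
correlation_id = filename[5:13]
print(correlation_id)  # 20121021
```

Every flow file whose filename yields the same correlation.id ends up in the same merged bundle.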