Member since 02-26-2017
19 Posts
0 Kudos Received
0 Solutions
05-15-2017
06:52 PM
1 Kudo
@yeah thatguy 10K FlowFiles in NiFi is nothing in terms of load. NiFi processors use system threads to run, and each processor can be configured with multiple "concurrent tasks". This allows one processor to essentially run multiple times at the same time. I would not, however, ever try to schedule one processor with 10,000 concurrent tasks (I don't know of any server that has 10,000 CPU cores). Can you elaborate on your use case and why you must load all 10K files in parallel rather than in rapid succession?

Processors are designed in a variety of ways depending on their function. Some processors work on one FlowFile at a time, while others work on batches of FlowFiles.

GetFile has a configurable Batch Size, which controls the number of files retrieved per processor execution. All files in a batch are committed as FlowFiles in NiFi at the same time upon ingestion. You could configure smaller batches and multiple concurrent tasks on this processor.

The ListFile processor retrieves a complete listing of all files in the target directory and then creates a single 0-byte FlowFile for each of them. The complete batch is committed to the success relationship at the same time.

The FetchFile processor retrieves the content of each of the listed files and inserts that content into the FlowFile. This processor is a good candidate for multiple concurrent tasks.

Each instance of NiFi runs in its own single JVM. Only FlowFile attributes live in JVM heap memory (FlowFile attributes are also persisted to disk). To help protect the JVM from OOM errors, NiFi will swap FlowFiles to disk if a connection's queue exceeds the configurable swapping threshold. The default swapping threshold is 20,000 and is set in the nifi.properties file. This setting is per connection, not for the entire NiFi dataflow(s).

FlowFile content is written to the NiFi content repository. It is then only accessed when a processor performs a function that requires it to read or modify that content.
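To illustrate the point about concurrent tasks, here is a minimal Python sketch of the general idea. It is not NiFi code: `process`, `run_bounded`, and the `concurrent_tasks` parameter are hypothetical names used only for this example. It shows a small, fixed number of worker threads draining 10,000 items in rapid succession, rather than one thread per item.

```python
from concurrent.futures import ThreadPoolExecutor


def process(item: int) -> int:
    """Hypothetical stand-in for the work done on one FlowFile."""
    return item * 2


def run_bounded(items, concurrent_tasks: int = 4):
    """Process every item using at most `concurrent_tasks` worker threads.

    Items queue up and are handled in rapid succession; the thread count
    stays fixed no matter how many items there are.
    """
    with ThreadPoolExecutor(max_workers=concurrent_tasks) as pool:
        # pool.map preserves input order in its results
        return list(pool.map(process, items))


results = run_bounded(range(10_000), concurrent_tasks=4)
print(len(results))
```

The thread count here plays roughly the role of a processor's concurrent tasks setting: raising it only helps up to the number of cores (and I/O capacity) the machine actually has.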
NiFi's JVM heap memory defaults to only 512 MB, but is configurable via NiFi's bootstrap.conf file.

Thanks,
Matt
03-04-2017
10:18 AM
1 Kudo
Hi @yeah thatguy, you can find more details about the error in the log file ./logs/nifi-app.log. The most common cause is that the port you defined in the processor is already in use, so the listener started by the processor cannot bind to it. Hope this helps.
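One quick way to check the "port already in use" theory from the NiFi host is to try binding the port yourself. This is only a sketch under assumptions: it assumes a TCP listener, and `port_is_free` is a hypothetical helper name, not part of NiFi. Run it while the processor is stopped, using the port you configured.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can bind host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Bind failed: typically "address already in use",
            # or a permission error for privileged ports.
            return False
        return True
```

If this returns False for your port while the processor is stopped, some other process is most likely holding it, and picking a different port in the processor should clear the error.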
02-27-2017
06:19 PM
@dsun Thanks. ExecuteStreamCommand is the processor we are going to use. NiFi's role will extend up to creating a data mart from all sources, and Oozie will take care of the rest of the flow.