NiFi - Use cases with low bandwidth

Master Guru

Does NiFi have any communication protocols that would help increase performance on networks with low bandwidth?

Use case example:

Moving data from an FTP server in California to a data center in New York. Performance is terrible because of limited bandwidth.

Can NiFi help? If so, how?

1 ACCEPTED SOLUTION

Master Mentor

@Sunile Manjee That's one of the strengths of NiFi: it supports prioritization of events, so you can specify which data is most important and send it through first. So the short answer is yes, as NiFi was designed to work in regions where connectivity is a luxury. Every connection can be configured with a few different prioritization rules, and you can set attributes on the data to mark which events are most important.
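
The attribute-based prioritization described above can be sketched in a few lines. This is plain Python, not NiFi code: the attribute name `priority`, the "lowest number goes first" ordering, and the file names are assumptions chosen for illustration.

```python
import heapq
from itertools import count

def drain_by_priority(flowfiles):
    """Yield queued items lowest priority number first, FIFO within ties."""
    order = count()  # tie-breaker that preserves arrival order
    heap = [(int(ff["priority"]), next(order), ff) for ff in flowfiles]
    heapq.heapify(heap)
    while heap:
        _, _, ff = heapq.heappop(heap)
        yield ff

# Hypothetical queue contents; each item carries a "priority" attribute.
queue = [
    {"name": "debug.log", "priority": "9"},
    {"name": "alerts.json", "priority": "1"},
    {"name": "metrics.csv", "priority": "5"},
]
sent = [ff["name"] for ff in drain_by_priority(queue)]
print(sent)
```

On a constrained link, the point is only that the most important items leave the queue first; the total volume sent is unchanged.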

Master Guru

How would that help with a single file that is not competing against other data for resources?

Master Mentor

If the network is unavailable, no NiFi trick can help. But as soon as there is connectivity, a file marked as highest priority will take precedence.

Master Guru

If there is connectivity and only one file, how will NiFi perform at transferring that file to the data center compared with other transfer methods? This is purely a case of a file that is not competing for resources. The question is simply whether NiFi can outperform other methods. Am I making sense?

Master Mentor

@Sunile Manjee Yes, absolutely. There are many knobs to tweak: you can manipulate the file so that only the important data is sent, compress it, and load-balance NiFi so that when you split the file into smaller chunks and compress them, they can be routed through multiple NiFi clusters. There are file, architecture, and pipeline tricks you can apply. Maybe you could specify which product you are positioning NiFi against, so we can concentrate on the pros and cons of that?
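
The "send only the important data, then compress" idea can be illustrated with a small sketch. It is plain Python rather than a NiFi flow, and the log format, the level names, and the `keep_important` helper are all hypothetical.

```python
import gzip

def keep_important(lines, drop_levels=("DEBUG", "TRACE")):
    """Drop log lines whose second whitespace-separated field is a dropped level."""
    kept = []
    for line in lines:
        parts = line.split()
        level = parts[1] if len(parts) > 1 else ""
        if level not in drop_levels:
            kept.append(line)
    return kept

# Hypothetical log: mostly DEBUG heartbeats, with an ERROR every tenth record.
raw_lines = []
for i in range(1000):
    raw_lines.append(f"2016-01-01T00:00:{i % 60:02d} DEBUG heartbeat {i}")
    if i % 10 == 0:
        raw_lines.append(f"2016-01-01T00:00:{i % 60:02d} ERROR export failed for batch {i}")

raw = "\n".join(raw_lines).encode()
filtered = "\n".join(keep_important(raw_lines)).encode()
compressed = gzip.compress(filtered)

# Bytes before filtering, after filtering, and after filtering plus gzip.
print(len(raw), len(filtered), len(compressed))
```

How much this saves depends entirely on the data: already-compressed or fully relevant content will shrink little or not at all.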

Master Guru

@Artem Ervits The tool I am positioning NiFi against is a simple Python script that pulls a 10 GB file nightly from an SFTP server. It takes several hours to get the data from the SFTP server into the data center. They have identified the bottleneck as the network bandwidth between the SFTP location and the data center. Can NiFi help speed up the load process?

Master Mentor

Yes, depending on how you wrote the script. If it is not processing the file asynchronously, then NiFi can help there: you can chunk the file into smaller, compressed files and route them through different NiFi nodes to get concurrency. Just the fact that you drag and drop processors instead of writing a script is a benefit. Picture this: you wrote the Python script, then you quit. Who will manage it then? Do you have other Python developers there? I know in your case it's a simple script, but the problem escalates when you have a considerable amount of code invested and nobody to support it. @Sunile Manjee
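
As a rough illustration of the chunk, compress, and send-concurrently approach described above, here is a minimal Python sketch. It does not use NiFi; `send_chunk` is a hypothetical stand-in for a real network send, and the chunk size and worker count are arbitrary values.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 1024 * 1024  # 1 MiB per chunk; arbitrary value for illustration

def split_and_compress(data, chunk_size=CHUNK_SIZE):
    """Return a list of (index, gzip-compressed chunk) pairs."""
    return [
        (i // chunk_size, gzip.compress(data[i:i + chunk_size]))
        for i in range(0, len(data), chunk_size)
    ]

def send_chunk(chunk):
    """Hypothetical network send; here it just hands the chunk to the receiver."""
    index, payload = chunk
    return index, payload

def transfer(data, workers=4):
    """Send all chunks on a thread pool, then reassemble them in index order."""
    chunks = split_and_compress(data)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        received = list(pool.map(send_chunk, chunks))
    return b"".join(gzip.decompress(payload) for _, payload in sorted(received))

original = b"2016-01-01 INFO nightly export row\n" * 100_000
assert transfer(original) == original
```

Concurrency of this kind mainly helps when a single connection cannot fill the link (for example, because of latency); if the link itself is saturated, the gain is more likely to come from the compression step.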

Super Mentor

NiFi supports compression, which can decrease the size of files being transferred across the network. NiFi can split large files into smaller files, which can be reassembled into the original larger files by a NiFi instance on the other side of the transfer. Those split files can be sent via multiple concurrent threads. If a network issue occurs, the entire file transfer does not start over; only that one small piece is resent. NiFi can also be used to remove portions of the content that do not need to be transferred. Think of system logs where some log lines have no value: those lines could be removed from the larger log file, reducing its size before it is transferred.
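
The point that a network failure costs only one small piece, not the whole file, can be sketched as follows. This is plain Python under stated assumptions: `flaky_send` simulates a link that drops one chunk once, and `send_with_retry` is a hypothetical helper, not a NiFi API.

```python
def send_with_retry(chunks, send, max_attempts=3):
    """Send each chunk in turn; on failure, resend only that chunk."""
    attempts = {}
    for index, payload in enumerate(chunks):
        for attempt in range(1, max_attempts + 1):
            attempts[index] = attempt
            try:
                send(index, payload)
                break  # this chunk arrived; move on to the next one
            except ConnectionError:
                if attempt == max_attempts:
                    raise  # give up after the last allowed attempt

    return attempts

# Simulated receiver and a link that drops chunk 1 on its first attempt only.
received = {}
failed_once = set()

def flaky_send(index, payload):
    if index == 1 and index not in failed_once:
        failed_once.add(index)
        raise ConnectionError("simulated network drop")
    received[index] = payload

chunks = [b"part-a;", b"part-b;", b"part-c;"]
attempts = send_with_retry(chunks, flaky_send)
reassembled = b"".join(received[i] for i in sorted(received))
print(attempts, reassembled)
```

Only the chunk that failed is sent twice; the other chunks are sent once, and the receiver reassembles the pieces by index.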

Master Guru

@mclark That is GREAT info. Where can I find the official documentation on this, so that it can be shared with the customer?