Which compression is used in Site-to-Site (Remote Process Group)?
- Labels: Apache NiFi, Cloudera DataFlow (CDF)
Created on ‎10-25-2015 05:07 PM - edited ‎08-19-2019 05:56 AM
- Which compression algorithm is used when a remote port communication is set up?
- Can it be customized?
- Does it work at the FlowFile level, or on the batch that the s2s protocol negotiated for transmission?
Created ‎10-25-2015 05:23 PM
Site-to-Site uses deflate at level 1 and compresses data in blocks/buffers. With Site-to-Site, a series of 1..N FlowFiles is sent at once and ack'd as a group. The compression is not configurable at this time. Keep in mind that you can compress before sending to s2s and decompress after receiving from s2s using the CompressContent processor.
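To make "deflate at level 1" concrete, here is a minimal sketch using the JDK's `java.util.zip` classes. It is not NiFi's actual Site-to-Site code and does not reproduce the block/buffer framing used on the wire. The class name `DeflateLevelOneSketch` and the sample payload are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

// Hypothetical illustration only; not taken from the NiFi code base.
final class DeflateLevelOneSketch {

    // Compress a payload with deflate at level 1 (Deflater.BEST_SPEED).
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (DeflaterOutputStream out = new DeflaterOutputStream(sink, deflater)) {
            out.write(input);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // A caller-supplied Deflater is not released by the stream's close().
            deflater.end();
        }
        return sink.toByteArray();
    }

    // Reverse the compression above and return the original bytes.
    static byte[] decompress(byte[] compressed) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (InflaterInputStream in =
                 new InflaterInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = in.read(buffer)) != -1) {
                sink.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        // Made-up, repetitive text standing in for the content of a batch of FlowFiles.
        StringBuilder batch = new StringBuilder();
        for (int i = 0; i < 1_000; i++) {
            batch.append("flowfile content line ").append(i % 10).append('\n');
        }
        byte[] original = batch.toString().getBytes(StandardCharsets.UTF_8);

        byte[] compressed = compress(original);
        byte[] restored = decompress(compressed);

        System.out.println("original bytes:   " + original.length);
        System.out.println("compressed bytes: " + compressed.length);
        System.out.println("round trip ok:    " + Arrays.equals(original, restored));
    }
}
```

Level 1 favors speed over ratio, which is the trade-off being discussed in this thread.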
Do you feel there would be a good bit of value in making the compression of s2s configurable? If so, would that be for cases where something like snappy makes sense, for example with certain types of text data?
Thanks
Joe
Created ‎10-25-2015 05:30 PM
Yes, Joe, I had something like snappy in mind as a good middle ground between size and performance.
At a minimum, a compression level property should be exposed to the operator to balance the existing compression protocol between speed/CPU load and network traffic volume.
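A rough way to see what such a level property would trade off is to run the same data through deflate at a few levels and compare output size and elapsed time. The sketch below uses only the JDK `Deflater`; the class name `DeflateLevelComparison` and the sample records are hypothetical, and the timings are a single unwarmed run, so treat them as indicative rather than a benchmark.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.Deflater;

// Hypothetical illustration only; not part of NiFi.
final class DeflateLevelComparison {

    // Returns the number of bytes produced when deflating input at the given level (0-9).
    static int compressedSize(byte[] input, int level) {
        Deflater deflater = new Deflater(level);
        try {
            deflater.setInput(input);
            deflater.finish();
            byte[] buffer = new byte[8192];
            int total = 0;
            while (!deflater.finished()) {
                total += deflater.deflate(buffer);
            }
            return total;
        } finally {
            deflater.end();
        }
    }

    public static void main(String[] args) {
        // Made-up text records standing in for FlowFile content.
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 20_000; i++) {
            sb.append("record ").append(i % 97).append(",status=OK\n");
        }
        byte[] data = sb.toString().getBytes(StandardCharsets.UTF_8);

        for (int level : new int[] {1, 6, 9}) {
            long start = System.nanoTime();
            int size = compressedSize(data, level);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.printf("level %d: %d -> %d bytes in %d us%n",
                    level, data.length, size, micros);
        }
    }
}
```

How much the higher levels actually save depends heavily on the data, which is the argument for leaving the choice to the operator.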
