07-04-2017 08:38 AM
High throughput is easy to achieve; low latency can be harder. You are talking about throughput here but about latency above, so you may want to clarify which of the two matters more to you.
No, you can easily recover from failures in SS and reprocess any data that wasn't processed because of a failure. In some situations you can even get exactly-once semantics. Kafka is something you use together with SS, not really instead of it.
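To make the recovery point concrete, here is a minimal sketch in plain Python. It is a toy model, not SS or Kafka API code: `run_batches`, `log`, `store` and `state` are hypothetical names made up for illustration. It assumes the source is replayable by offset (as Kafka is), that the offset is only committed after a micro-batch's output has been written, and that the output is an idempotent upsert keyed by record. Under those assumptions, a batch that is replayed after a failure does not produce duplicates, which is one way to end up with exactly-once results.

```python
def run_batches(log, store, state, batch_size, fail_on_batch=None):
    """Process records from state["offset"] onward in micro-batches.

    log   : list of (key, value) records, standing in for a replayable source
    store : dict standing in for an idempotent sink (upsert by key)
    state : dict holding the last committed offset
    fail_on_batch : hypothetical knob that simulates a crash mid-batch
    """
    batch_no = 0
    while state["offset"] < len(log):
        start = state["offset"]
        batch = log[start:start + batch_size]
        for i, (key, value) in enumerate(batch):
            if batch_no == fail_on_batch and i == 1:
                # Crash after the first record of this batch was written.
                raise RuntimeError("simulated worker failure")
            store[key] = value  # idempotent: rewriting the same key is harmless
        # Commit the offset only after the whole batch has been written.
        state["offset"] = start + len(batch)
        batch_no += 1


log = [(f"k{i}", i * 10) for i in range(6)]
store, state = {}, {"offset": 0}

try:
    run_batches(log, store, state, batch_size=2, fail_on_batch=1)
except RuntimeError:
    pass  # the job died; the committed offset still points at the failed batch

# Restart: processing resumes from the last committed offset and replays
# the partially written batch.
run_batches(log, store, state, batch_size=2)
```

After the restart, the half-written batch is simply processed again from its committed offset. If the sink were an append-only list instead of a keyed upsert, the same replay would give you at-least-once delivery with a duplicate record, which is why the output side matters as much as the offset handling.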
Currently incubating in Cloudera Labs: Envelope