Created 01-22-2016 11:22 PM
Can someone please help me understand the difference between Google Cloud Dataflow and Hortonworks DataFlow? Are there technical differences I should be aware of?
Created 01-22-2016 11:47 PM
Google Cloud Dataflow is a service that replaces MapReduce processing and is designed strictly for the Google Cloud Platform, whereas Hortonworks DataFlow is a product that aims to solve data flow problems, even outside the data center.
So they are not the same thing; they use similar names to describe very different things. One sits in the cloud waiting for data to be delivered to it; the other delivers data to all kinds of processing systems: Google Dataflow, Storm, Spark, etc.
Created 01-23-2016 02:04 AM
Google Dataflow is a language framework for multiple engines like Spark, Flink and MapReduce. Hortonworks DataFlow is a data-in-motion processing tool with a visual editor.
Created 02-04-2016 01:36 PM
More of an FYI than an answer: the proposal to join the Apache Incubator:
acceptance: