What is the difference between Hadoop and Spark?

I think Hadoop and Spark are both big data frameworks, so why is Spark killing Hadoop? What is the difference between Hadoop and Spark?


Hadoop and Spark are two different frameworks. At a very high level, Hadoop is a storage layer and Spark is a processing engine.

You can store the data in HDFS and run aggregations on top of that data using Spark.


@Pritam Pal, Hadoop is a combination of HDFS (data storage), YARN (resource management and application execution framework), and MapReduce (data processing engine). Thus, it is not fair to compare Hadoop and Spark. MapReduce and Spark are comparable because both of them are data processing engines.
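
To make "data processing engine" a bit more concrete, here is a minimal single-process sketch of the map, shuffle, reduce model in plain Python. It does not use the Hadoop or Spark APIs, and the function names (`map_phase`, `shuffle_phase`, `reduce_phase`, `word_count`) and the sample lines are made up for illustration. A real engine runs these phases in parallel across many machines.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into one count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Chain the three phases into a single word-count job."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    # Hypothetical sample input, standing in for lines read from storage.
    sample = ["spark and hadoop", "hadoop stores data", "spark processes data"]
    print(word_count(sample))
```

The same word count can be expressed in either engine. What differs is how each one executes and chains such phases, which is what the comparison is about.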

Here is a good link which compares MapReduce and Spark in detail.

MapReduce:

  • Writes intermediate data to local disk
  • Slower
  • Suitable for batch processing
  • Requires an external scheduler
  • High latency
  • No built-in interactive mode
  • Runs on less expensive hardware
  • Harder to program

Spark:

  • Keeps data in memory
  • Faster
  • Suitable for batch and near-real-time (streaming) processing
  • Schedules tasks itself
  • Low latency
  • Has an interactive mode
  • Needs a lot of RAM to run in memory; adding RAM across the cluster increases its cost
  • Easier to program
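
The disk versus in-memory point can be shown with a toy two-stage job. This is only a sketch in plain Python, not how either engine is implemented: `run_via_disk` imitates the MapReduce habit of writing the intermediate result to disk between stages, and `run_in_memory` imitates Spark passing it straight to the next stage. All function names and the sample data are assumptions made for the example.

```python
import json
import os
import tempfile


def stage_one(numbers):
    """First stage: square every input value."""
    return [n * n for n in numbers]


def stage_two(values):
    """Second stage: sum the intermediate values."""
    return sum(values)


def run_via_disk(numbers):
    """MapReduce-like: persist the intermediate result, then read it back."""
    intermediate = stage_one(numbers)
    fd, path = tempfile.mkstemp(suffix=".json")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(intermediate, f)   # extra write between stages
        with open(path) as f:
            reloaded = json.load(f)      # extra read between stages
    finally:
        os.remove(path)
    return stage_two(reloaded)


def run_in_memory(numbers):
    """Spark-like: hand the intermediate result to the next stage directly."""
    return stage_two(stage_one(numbers))


if __name__ == "__main__":
    data = list(range(1, 6))  # hypothetical input
    print(run_via_disk(data), run_in_memory(data))
```

Both functions return the same answer. The disk version pays for a write and a read at every stage boundary, which adds up in multi-stage or iterative jobs, while the in-memory version needs enough RAM to hold the intermediate data.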
