What is the difference between Hadoop and Spark?

New Contributor

I think Hadoop and Spark are both big data frameworks, so why do people say Spark is killing Hadoop? What is the difference between Hadoop and Spark?

Re: What is the difference between Hadoop and Spark?

@Pritam Pal

Hadoop and Spark are two different frameworks. At a very high level, Hadoop is a storage layer and Spark is a processing engine.

You can store the data in HDFS and run aggregations with Spark on top of that data.
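
As a rough illustration of that split, here is a minimal PySpark sketch that reads a CSV file stored in HDFS and aggregates it with Spark. The NameNode host, port, file path, and the `country` and `amount` column names are all made up for this example, and it assumes `pyspark` is installed and the cluster is reachable.

```python
def hdfs_uri(namenode: str, port: int, path: str) -> str:
    """Build an hdfs:// URI. Host, port, and path are placeholders."""
    return f"hdfs://{namenode}:{port}/{path.lstrip('/')}"


def total_amount_by_country(input_uri: str):
    """Read a CSV file from HDFS and aggregate it with Spark.

    Assumes a header row with `country` and `amount` columns
    (hypothetical names chosen for this sketch).
    """
    # Imported lazily so hdfs_uri() still works without pyspark installed.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("hdfs-aggregation-sketch").getOrCreate()

    # HDFS only stores the file; Spark does the reading and the computation.
    orders = spark.read.csv(input_uri, header=True, inferSchema=True)
    return orders.groupBy("country").agg(F.sum("amount").alias("total_amount"))
```

Calling `total_amount_by_country(hdfs_uri("namenode.example.com", 8020, "/data/orders.csv")).show()` would print the per-country totals, assuming such a file exists.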

Re: What is the difference between Hadoop and Spark?

Guru

@Pritam Pal, Hadoop is a combination of HDFS (data storage), YARN (application execution framework), and MapReduce (data processing engine). Thus, it is not fair to compare Hadoop and Spark. MapReduce and Spark are comparable because both of them are data processing engines.

Here is a good link that compares MapReduce and Spark in detail:

https://www.xplenty.com/blog/apache-spark-vs-hadoop-mapreduce/
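
To make the "data processing engine" part concrete, here is a toy word count written in plain Python that mimics the three phases of the MapReduce model (map, shuffle, reduce). It is only a single-process sketch of the programming model, not the Hadoop API; the function names are invented for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(groups)


def reduce_phase(groups: Dict[str, List[int]]) -> Dict[str, int]:
    """Reduce: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Chain the three phases, as a MapReduce job would."""
    return reduce_phase(shuffle_phase(map_phase(lines)))
```

In a real Hadoop job the map and reduce steps run on different nodes and the shuffle moves data between them over the network. Spark expresses the same kind of computation as a chain of transformations on a distributed dataset.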

Re: What is the difference between Hadoop and Spark?

New Contributor

Hadoop

  • Stores intermediate data on local disk
  • Slower processing
  • Suitable for batch processing
  • Requires external schedulers
  • High latency
  • No built-in interactive mode
  • Runs on less expensive hardware
  • Harder to program

Spark

  • Keeps data in memory
  • Faster processing
  • Suitable for batch and near-real-time (streaming) processing
  • Schedules tasks itself
  • Low latency
  • Has an interactive mode
  • Needs a lot of RAM to run in memory; adding RAM across the cluster increases its cost
  • Easier to program
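
The disk versus in-memory point in the two lists can be illustrated with a toy pipeline in plain Python. One runner writes every stage's output to a file and reads it back before the next stage, loosely like chained MapReduce jobs; the other keeps the data in memory between stages, loosely like Spark. This is only a sketch of the idea under those assumptions, with invented function names, and it uses neither framework.

```python
import json
import os
import tempfile
from typing import Callable, List, Tuple

Stage = Callable[[int], int]


def run_with_disk_between_stages(
    data: List[int], stages: List[Stage], workdir: str
) -> Tuple[List[int], int]:
    """Persist each stage's output to disk, then reload it for the next stage."""
    disk_writes = 0
    for index, stage in enumerate(stages):
        data = [stage(x) for x in data]
        path = os.path.join(workdir, f"stage_{index}.json")
        with open(path, "w") as handle:
            json.dump(data, handle)  # intermediate result goes to disk
        disk_writes += 1
        with open(path) as handle:
            data = json.load(handle)  # next stage starts by reading it back
    return data, disk_writes


def run_in_memory(data: List[int], stages: List[Stage]) -> Tuple[List[int], int]:
    """Pass each stage's output straight to the next stage without touching disk."""
    for stage in stages:
        data = [stage(x) for x in data]
    return data, 0
```

Both runners return the same result for the same stages; the difference is the extra write and read per stage, which is where much of the latency gap between the two approaches comes from on multi-stage or iterative jobs.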