
Difference between HADOOP and YARN job

Expert Contributor


This may be a simple question: what is the difference between the two commands below?

1) yarn jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi 10 10000;

2) hadoop jar /usr/hdp/current/hadoop-mapreduce-client/hadoop-mapreduce-examples.jar pi 10 10000;

I ran both and compared the logs, and they look the same. In fact, both took the same amount of time.




All you are doing is running MapReduce code through both yarn and hadoop, so it looks the same.

Try starting a Spark job with hadoop jar instead. A Hadoop (MapReduce) job is just one kind of YARN job.
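If you want to confirm that both submissions ended up as the same kind of YARN application, one option is to list the finished applications and look at the type column. This is only a sketch: it assumes the yarn CLI is on your PATH on a cluster node, and the LIST_CMD variable is just a name I picked for illustration.

```shell
# Both "yarn jar" and "hadoop jar" runs of the pi example should be
# listed here with application type MAPREDUCE.
LIST_CMD="yarn application -list -appStates FINISHED -appTypes MAPREDUCE"

if command -v yarn >/dev/null 2>&1; then
  # yarn CLI is available: run the listing
  $LIST_CMD
else
  # Not on a cluster node: just show what would be run
  echo "yarn CLI not found; run this on a cluster node: $LIST_CMD"
fi
```

If both of your runs appear in that list with the same type, that matches what you saw in the logs.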

Super Collaborator

But you use spark-submit for a Spark job, not "hadoop jar" or "yarn jar".

spark-submit has a YARN client wrapped inside it: when it is run in YARN mode, it asks YARN to start an ApplicationMaster (AM) and then requests the containers in which the Spark ETL job is executed.
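For comparison with the MapReduce pi example, a Spark submission in YARN mode would look roughly like the following. Treat it as a sketch: the jar path is a placeholder (it depends on your installation), EXAMPLES_JAR and SUBMIT_CMD are names made up for this example, and the block only prints the command rather than running it.

```shell
# Placeholder path: point this at the Spark examples jar on your cluster.
EXAMPLES_JAR="${EXAMPLES_JAR:-/path/to/spark-examples.jar}"

# --master yarn tells spark-submit to act as a YARN client;
# --deploy-mode cluster runs the driver inside the ApplicationMaster.
SUBMIT_CMD="spark-submit --master yarn --deploy-mode cluster --class org.apache.spark.examples.SparkPi $EXAMPLES_JAR 10"

# Print the command for inspection; run it yourself on a cluster node.
echo "$SUBMIT_CMD"
```

Note there is no "hadoop jar" or "yarn jar" anywhere in it; spark-submit talks to the ResourceManager itself.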
