I have an uncommon question: what is Tez in general? I've read a lot over the last few days (including the official paper and the various descriptions on the Hortonworks/Tez/... websites), and I still don't get the point.
As far as I understand, it's an improvement over MR because it offers DAGs, so that intermediate HDFS writes can be avoided. It is also meant more as an interface for tools like Pig and Hive than for application developers, and for DAG-related applications you are supposedly better off using Spark. Why exactly?!
And how does Tez work? How are the DAGs executed? I've read several times that it's more a task executor than an engine. Given this statement, I'm asking myself which executor is used. MR? I couldn't find anything written about this. Additionally, in diagrams of Tez in cluster architectures, Tez sits below MR, Spark and other engines. Or did I misunderstand this completely, and there is an engine in the background of Tez?
It would be great if someone could shed some light on this.
Yes, Tez is an execution engine. I would recommend going through the links below, which answer most of your questions.
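To make the DAG idea from the question a bit more concrete, here is a toy sketch in Python. It is not Tez's API and not how Tez is implemented; `run_dag`, the vertex names and the sample input are all made up for illustration. It only shows the general pattern: the whole pipeline is described as one graph, the vertices run in dependency order, and only the final vertex's output is handed back for durable storage. The in-memory dictionary stands in for whatever transport a real engine uses along an edge.

```python
from collections import Counter
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_dag(vertices, edges, source):
    """Run a toy DAG (hypothetical helper, not a Tez call).

    vertices: dict mapping a vertex name to a function that takes a list of inputs
    edges:    list of (producer, consumer) pairs
    source:   input handed to every vertex that has no producer
    """
    # Map every vertex to the list of vertices that feed it.
    preds = {name: [] for name in vertices}
    for producer, consumer in edges:
        preds[consumer].append(producer)

    results = {}
    # static_order() yields each vertex only after all of its predecessors.
    for name in TopologicalSorter(preds).static_order():
        inputs = [results[p] for p in preds[name]] or [source]
        results[name] = vertices[name](inputs)

    # Only vertices without a consumer (the sinks) produce final output;
    # everything else stays an intermediate result inside the run.
    producers = {p for p, _ in edges}
    return {name: results[name] for name in vertices if name not in producers}


# A three-stage word count expressed as one graph instead of three chained jobs.
vertices = {
    "tokenize": lambda inputs: [w for line in inputs[0] for w in line.split()],
    "count": lambda inputs: Counter(inputs[0]),
    "top": lambda inputs: inputs[0].most_common(1),
}
edges = [("tokenize", "count"), ("count", "top")]

result = run_dag(vertices, edges, ["a b a", "b a"])
print(result)
```

If the same three stages were chained as separate MR jobs, each job would write its output to HDFS before the next one could read it; described as a single DAG, only the output of `top` has to be persisted.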