Created 11-10-2016 11:59 PM
I'm using the Hortonworks sandbox VM to run Hadoop services. I executed a Pig filter script in Tez mode. Unlike Hive, the Pig log (console) doesn't show any information about the number of mappers and reducers being executed. Am I looking in the wrong place?
Created on 11-11-2016 02:17 PM - edited 08-18-2019 03:33 AM
In Tez, "tasks" represent map operations or reduce operations. A DAG is a full workflow (job) made up of vertices (each vertex processes a set of tasks) and edges (data movement between vertices).
See these links for a more detailed discussion:
http://hortonworks.com/blog/expressing-data-processing-in-apache-tez/
https://community.hortonworks.com/questions/32164/question-on-tez-dag-task-and-pig-on-tez.html
https://cwiki.apache.org/confluence/display/TEZ/How+initial+task+parallelism+works
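To make the vocabulary concrete, here is a minimal sketch in Python of the DAG / vertex / task relationship described above. It is only an illustration: the class names, vertex names and task counts are made up for this example and are not the Tez API or real output from a Pig job.

```python
from dataclasses import dataclass, field


@dataclass
class Vertex:
    """One processing step of the DAG (hypothetical model, not the Tez API)."""
    name: str
    tasks: int  # number of parallel tasks running this step


@dataclass
class Dag:
    """A full workflow (job): vertices plus edges between them."""
    name: str
    vertices: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_vertex(self, name: str, tasks: int) -> None:
        self.vertices[name] = Vertex(name, tasks)

    def add_edge(self, src: str, dst: str) -> None:
        # An edge models data movement from one vertex to another.
        if src not in self.vertices or dst not in self.vertices:
            raise KeyError("both ends of an edge must be existing vertices")
        self.edges.append((src, dst))

    def total_tasks(self) -> int:
        # What MapReduce would report as mappers + reducers is, in this
        # model, simply the sum of tasks over all vertices.
        return sum(v.tasks for v in self.vertices.values())


# Made-up numbers: a map-like vertex with 4 tasks feeding a reduce-like
# vertex with 2 tasks.
dag = Dag("filter-and-group")
dag.add_vertex("scan_filter", tasks=4)
dag.add_vertex("group", tasks=2)
dag.add_edge("scan_filter", "group")

print(dag.total_tasks())
```

So instead of looking for separate mapper and reducer counts, look for the per-vertex task counts of the DAG.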
You can see the number of tasks in the console output.
You can also see this in the Ambari Tez view (and drill down for more detail).
See this for understanding Ambari Tez view: https://docs.hortonworks.com/HDPDocuments/Ambari-2.1.2.0/bk_ambari_views_guide/content/section_using...
Created 11-11-2016 12:05 AM
If you verified your output for Hive via the Hive/Beeline shell, then it's a different story: you are actually seeing the output on STDOUT. With Pig, you can try using the Pig view instead.
Ambari -> admin (drop down) -> Manage Ambari -> Views -> PIG -> Create Instance (If you don't have a PIG view already)
Created 11-11-2016 12:13 AM
Thank you for your response, but I had done that already. I executed the Pig script both via the terminal (pig -x tez script.pig) and from the Ambari Pig view. Also, when I expanded the logs section in the Pig view, it was very similar to the one I had in the terminal (console). I don't see any mapper or reducer counts. 😞