Created on 09-05-2018 06:19 PM - edited 08-18-2019 01:55 AM
Current process: Click a page > Wait > Copy table to Excel > Repeat (tedious!)
I use the information for some summary metrics about usage of our database and error checking on batch queries.
Note: Since the Tez UI Ambari view is going away in 2.7 (reference link), I would REALLY like to find a way to get at this information programmatically.
<Screenshot of Tez Hive Queries screen attached>
Created 09-07-2018 02:24 PM
http://<ats>:8188/ws/v1/timeline/HIVE_QUERY_ID
http://<ats>:8188/ws/v1/timeline/TEZ_DAG_ID
@Gary Whiteford, you can use the above API calls to fetch the Tez application details directly from the Application Timeline Server (ATS).
Hope this helps.
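In case it helps to see the calls in script form, here is a rough Python sketch of pulling one of those endpoints and turning the result into CSV that Excel can open. Treat it as a starting point under a few assumptions: the helper names (build_url, fetch_entities, entities_to_csv) are made up for illustration, the host is a placeholder for your ATS host, the `limit` query parameter and the top-level `entities` list with `entity`, `entitytype` and `starttime` fields follow the standard ATS v1 response format, and the cluster is assumed to allow unauthenticated HTTP (a Kerberized cluster would need extra handling). Anything Hive-specific inside each entity is not covered here and should be checked against your own output.

```python
import csv
import io
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_url(ats_host, entity_type, limit=100, port=8188):
    """Build an ATS v1 timeline URL for one entity type (e.g. HIVE_QUERY_ID)."""
    query = urlencode({"limit": limit})
    return f"http://{ats_host}:{port}/ws/v1/timeline/{entity_type}?{query}"


def fetch_entities(url):
    """GET the URL and return the list stored under the 'entities' key."""
    with urlopen(url) as response:
        payload = json.load(response)
    return payload.get("entities", [])


def entities_to_csv(entities):
    """Flatten the common top-level fields of each entity into CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["entity", "entitytype", "starttime"])
    for item in entities:
        # Missing fields become empty cells rather than raising an error.
        writer.writerow(
            [item.get("entity"), item.get("entitytype"), item.get("starttime")]
        )
    return buffer.getvalue()


# Example use (needs network access to your ATS host, so it is left commented out):
# url = build_url("ats.example.com", "HIVE_QUERY_ID", limit=50)
# with open("hive_queries.csv", "w") as out:
#     out.write(entities_to_csv(fetch_entities(url)))
```

The fetch and the CSV step are kept separate so you can test the flattening on a saved JSON response before pointing the script at a live cluster.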
Created 09-07-2018 02:37 PM
Tez actually ships with a Pig loader for mining Tez logs; you can find the details at https://github.com/apache/tez/tree/master/tez-tools/tez-tfile-parser
Here's a sample:
set pig.splitCombination false;
set tez.grouping.min-size 52428800;
set tez.grouping.max-size 52428800;

/* Register all tez jars. Replace $TEZ_HOME, $TEZ_TFILE_DIR with absolute path */
register '$TEZ_HOME/*.jar';
register '$TEZ_TFILE_DIR/tfile-parser-1.0-SNAPSHOT.jar';

raw = load '/app-logs/root/logs/application_1411511669099_0769/*' using org.apache.tez.tools.TFileLoader() as (machine:chararray, key:chararray, line:chararray);
filterByLine = FILTER raw BY (key MATCHES '.*container_1411511669099_0769_01_000001.*') AND (line MATCHES '.*Shuffle.*');
dump filterByLine;
Created 09-07-2018 07:43 PM
Thank you both. This makes me very hopeful.
May I ask for one more layer of context (since working with APIs and Pig scripts is new to me)?
Something along the lines of...
Thanks for your patience with a neophyte. :)