Query JobHistory Server in YARN to obtain data for completed jobs

I am looking for the best way to gather the file system counters, job counters, and MapReduce framework details for all jobs that ran on a specific date. When a job completes, its logs are stored in HDFS and its job information is handed off to a dedicated service, the JobHistory Server. I have looked at the node running the JobHistory Server, and port 19888 is currently locked down. I am looking for a way to do one of the following:

1) query HDFS to get the data I need,

2) open the port and use the JobHistory web UI on port 19888, or

3) use some other method.

We are running CDH v5.1.x and are currently not using Cloudera Manager.

1 ACCEPTED SOLUTION

I have discovered that the Hadoop JobHistory service has not been running, which explains why no logs have been moved to HDFS. I will have to fix that, and then I should have the functionality I am looking for.
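
For anyone landing here with the same goal: once the JobHistory service is running and port 19888 is reachable from wherever the script runs (for example, on the JobHistory node itself), the JobHistory Server REST API can list the jobs that finished in a time window and return the counters for each one. Below is a minimal Python sketch, not a tested production script. The host name `jobhistory.example.com` is a placeholder, and treating "a specific date" as a UTC calendar day is my assumption; adjust both for your cluster.

```python
import json
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumption: placeholder host name. 19888 is the JobHistory web port from the question.
HISTORY_URL = "http://jobhistory.example.com:19888/ws/v1/history/mapreduce"


def day_window_ms(day):
    """Return (begin, end) in epoch milliseconds for a UTC day given as 'YYYY-MM-DD'."""
    start = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000) - 1


def jobs_url(day, base=HISTORY_URL):
    """Build the URL that lists jobs whose finish time falls within the given day."""
    begin, end = day_window_ms(day)
    query = urlencode({"finishedTimeBegin": begin, "finishedTimeEnd": end})
    return f"{base}/jobs?{query}"


def flatten_counters(payload):
    """Map 'group name/counter name' to totalCounterValue for one counters response."""
    flat = {}
    for group in payload.get("jobCounters", {}).get("counterGroup", []):
        for counter in group.get("counter", []):
            key = f"{group['counterGroupName']}/{counter['name']}"
            flat[key] = counter["totalCounterValue"]
    return flat


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    with urlopen(url, timeout=30) as resp:
        return json.load(resp)


def counters_for_day(day, base=HISTORY_URL):
    """Return {job id: flattened counters} for every job that finished on the given day."""
    listing = fetch_json(jobs_url(day, base))
    jobs = (listing.get("jobs") or {}).get("job", [])
    return {
        job["id"]: flatten_counters(fetch_json(f"{base}/jobs/{job['id']}/counters"))
        for job in jobs
    }
```

Calling `counters_for_day("2014-08-01")` would return one dictionary per job, keyed by job ID. The file system, job, and MapReduce framework counters show up as separate counter groups in the response, so they can be filtered by the group-name prefix of each key. If the cluster is secured, the plain `urlopen` call will not be enough and authentication would need to be added.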
