
Query JobHistory Server in YARN to obtain data for completed jobs

Contributor

I am looking for the best way to gather the filesystem counters, job counters, and MapReduce framework counters for all jobs that ran on a specific date. When a job completes, its logs are stored in HDFS and the information about the job is sent to a dedicated server, the JobHistory Server. I have looked at the node running the JobHistory Server, but port 19888 is currently locked down. I am looking for a way to either:

 

1) query HDFS to get the data I need, or

2) open the port and use the JobHistory web UI on port 19888, or

3) use some other method

 

We are on CDH v5.1.x and are not currently using Cloudera Manager.

1 ACCEPTED SOLUTION

Contributor
I have discovered that the Hadoop JobHistory service has not been running, which explains why no logs have been moved to HDFS. I will have to fix that, and then I should have the functionality I am looking for.
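
For anyone landing here with the same question: once the JobHistory service is running and port 19888 is reachable, the web UI is not the only option, since the History Server also serves a REST API on that port. Below is a rough Python sketch of how the counters for one day might be pulled through it. The host name is a placeholder, the day boundaries are assumed to be UTC, and the helper names are made up for illustration. Only the `/ws/v1/history/mapreduce/jobs` endpoints and the `finishedTimeBegin` / `finishedTimeEnd` parameters come from the MapReduce History Server REST API, so please check them against the docs for your Hadoop version.

```python
import json
import urllib.parse
import urllib.request
from datetime import datetime, timedelta, timezone

# Assumption: placeholder host. 19888 is the JobHistory web port from the question.
HISTORY_URL = "http://jobhistory.example.com:19888"
API_ROOT = "/ws/v1/history/mapreduce/jobs"


def day_bounds_ms(day):
    """Return (start, end) of a YYYY-MM-DD day as epoch milliseconds (UTC assumed)."""
    start = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000) - 1


def jobs_url(base, day):
    """Build the URL that lists jobs which finished on the given day."""
    begin, end = day_bounds_ms(day)
    query = urllib.parse.urlencode(
        {"finishedTimeBegin": begin, "finishedTimeEnd": end}
    )
    return f"{base}{API_ROOT}?{query}"


def counters_url(base, job_id):
    """Build the URL that returns all counters for one job."""
    return f"{base}{API_ROOT}/{job_id}/counters"


def flatten_counters(payload):
    """Turn a counters response into {(group_name, counter_name): total_value}."""
    flat = {}
    groups = (payload.get("jobCounters") or {}).get("counterGroup", [])
    for group in groups:
        for counter in group.get("counter", []):
            key = (group["counterGroupName"], counter["name"])
            flat[key] = counter["totalCounterValue"]
    return flat


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def counters_for_day(base, day):
    """Return {job_id: flattened counters} for every job that finished on `day`."""
    listing = fetch_json(jobs_url(base, day))
    jobs = (listing.get("jobs") or {}).get("job", [])
    return {
        job["id"]: flatten_counters(fetch_json(counters_url(base, job["id"])))
        for job in jobs
    }
```

Calling `counters_for_day(HISTORY_URL, "2014-09-01")` (the date is only an example) would return one dictionary per job, keyed by counter group and counter name, which covers the filesystem, job, and MapReduce framework counter groups in one pass. Filtering on finish time rather than start time is a choice made here; `startedTimeBegin` / `startedTimeEnd` may suit better if "ran on a date" means "started on that date".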
