
Query JobHistory Server in YARN to obtain data for completed jobs


I am looking for the best way to gather the file system counters, job counters, and MapReduce framework counters for all jobs that ran on a specific date. When a job completes, its logs are stored in HDFS and its information is sent to a dedicated server, the JobHistory Server. On the node running the JobHistory Server, port 19888 is currently locked down. I am looking for a way to either:

1) query HDFS to get the data I need,

2) open the port and use the JobHistory web UI on port 19888, or

3) use some other method.

We are on CDH v5.1.x and are not currently using Cloudera Manager.

1 ACCEPTED SOLUTION

I have discovered that the Hadoop JobHistory service has not been running, which explains why no logs have been moved to HDFS. I will have to fix that, and then I should have the functionality I am looking for.
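For anyone who lands here with the same goal: once the JobHistory service is running and port 19888 is reachable, the counters can be pulled over the History Server REST API instead of the web UI. Below is a minimal sketch, not a tested production script. The host name is a placeholder, the function names are my own, and the date is interpreted as a UTC calendar day, which is an assumption you may want to change to your cluster's time zone.

```python
import json
import urllib.request
from datetime import datetime, timedelta, timezone

# Assumption: placeholder host. 19888 is the JobHistory web port from the question.
HISTORY_SERVER = "http://jobhistory.example.com:19888"


def day_window_ms(day):
    """Return (begin, end) in epoch milliseconds covering one UTC calendar day."""
    start = datetime.strptime(day, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    end = start + timedelta(days=1)
    return int(start.timestamp() * 1000), int(end.timestamp() * 1000) - 1


def jobs_url(base, day):
    """Build the job-list URL, filtered to jobs that finished on the given day."""
    begin, end = day_window_ms(day)
    return (f"{base}/ws/v1/history/mapreduce/jobs"
            f"?finishedTimeBegin={begin}&finishedTimeEnd={end}")


def flatten_counters(payload):
    """Turn a .../jobs/{jobid}/counters response into {group: {counter: total}}."""
    groups = payload.get("jobCounters", {}).get("counterGroup", [])
    return {
        g["counterGroupName"]: {
            c["name"]: c["totalCounterValue"] for c in g.get("counter", [])
        }
        for g in groups
    }


def fetch_json(url):
    """GET a URL and decode the JSON body."""
    req = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def counters_for_day(base, day):
    """Return {job_id: {group: {counter: total}}} for jobs finished on `day`."""
    jobs = fetch_json(jobs_url(base, day)).get("jobs") or {}
    result = {}
    for job in jobs.get("job", []):
        url = f"{base}/ws/v1/history/mapreduce/jobs/{job['id']}/counters"
        result[job["id"]] = flatten_counters(fetch_json(url))
    return result
```

Calling counters_for_day(HISTORY_SERVER, "2014-09-01") should return one entry per job. The file system, job, and framework counters appear under the group names org.apache.hadoop.mapreduce.FileSystemCounter, org.apache.hadoop.mapreduce.JobCounter, and org.apache.hadoop.mapreduce.TaskCounter. If the cluster is secured, plain urllib will not be enough, and the request will need whatever authentication the cluster requires.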
