Approach to collect logs from mappers

Contributor

I have a YARN map/reduce application.

In the mapper I use log4j to log certain cases. After job execution finishes I want to analyze the logs from ALL mappers. As there are a lot of mappers in my job, log analysis becomes rather painful...

Is there a way to write logs from the mappers to a single aggregated file, so that all the records are in one place? Or is there perhaps an approach to combine the log files from all mappers of a specific job?

1 ACCEPTED SOLUTION

Super Guru

If you are running MapReduce on YARN, enable YARN remote log aggregation (http://hortonworks.com/blog/simplifying-user-logs-management-and-access-in-yarn/). It will centralize all logging for you, and you can then perform your analysis on the aggregated logs.

3 REPLIES

Master Guru

Yes, though it should be enabled by default. You can retrieve the log files with the yarn logs command line, or you can use Pig as well.

https://community.hortonworks.com/articles/33703/mining-tez-app-log-file-with-pig-script.html

Contributor

Thank you @Rajkumar Singh!

I would only add that I find the following approach rather convenient:

yarn logs -applicationId application_1465548978834_0004 | grep my.foo.class > /home/user/log2.txt

With this command I can filter all the log entries for the class I want to analyze.
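
One caveat with the plain grep above: the dots in my.foo.class are treated as regex wildcards, so a line containing something like myXfooYclass would match too. A minimal sketch of a stricter filter is below. The function name filter_class_logs is made up for illustration; the only change from the command above is grep's -F flag, which matches the class name as a literal string.

```shell
#!/bin/sh
# filter_class_logs CLASS
# Hypothetical helper: reads log text on stdin and prints only the lines
# that contain CLASS as a literal string (-F disables regex matching,
# "--" guards against a pattern that starts with a dash).
filter_class_logs() {
  grep -F -- "$1"
}

# Usage against a cluster, reusing the application ID and paths from the
# command above:
#   yarn logs -applicationId application_1465548978834_0004 \
#     | filter_class_logs my.foo.class > /home/user/log2.txt
```

Because the helper only reads stdin, it can also be run over a log file that was saved earlier, without querying YARN again.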