Created 06-10-2016 05:39 PM
I can't find the log files from my MapReduce jobs. I'm using MR2 in HortonWorks 2.4.3 sandbox I got from here.
To get all the logs written to one directory, I have set the following environment variables:
export HADOOP_MAPRED_HOME=/home/hadoop
export HADOOP_YARN_HOME=/hadoop/yarn
export YARN_LOG_DIR=/hadoop/yarn/log
export HADOOP_LOG_DIR=/hadoop/yarn/log
export HADOOP_MAPRED_LOG_DIR=/hadoop/yarn/log
I'm not sure if setting these in my environment session has any effect when running the job. Presumably I have to set this in Ambari?
I do see the job history logs in /var/log/hadoop-mapreduce/mapred/, but I don't see the logs from my MapReduce program itself. Following the link in Ambari for the MapReduce JobHistory UI takes me to http://<>:19888/jobhistory, which shows no jobs.
I have tried starting my mapreduce job using
yarn jar ./lib/<my>.jar <mapreduce driver class name> <input file name> <output hdfs dir name> <properties file>
and
hadoop jar ./lib/<my>.jar <mapreduce driver class name> <input file name> <output hdfs dir name> <properties file>
Both give the same result, except that the second prints a warning message:
WARNING: Use "yarn jar" to launch YARN applications.
I see nothing in the ResourceManager (http://<>:8088) or NodeManager (http://<>:8042) UI.
According to Simplifying user-logs management and access in YARN, I should be using Application Ids with the yarn commands. But where are these Application Ids set when I invoke the Map Reduce program as I do above?
Secondly, how do I set properties like yarn.nodemanager.log-dirs? By default it's ${yarn.log.dir}/userlogs. Where is ${yarn.log.dir} set?
I think this is what I'm missing (along with how to set/get an application Id from a MapReduce program).
I think I'm missing something obvious so a nudge in the right direction would be appreciated.
Created 06-10-2016 11:17 PM
SOLVED!
My project was using a Hadoop dependency of version 1.2.1. Even after I changed the dependency to 2.2.0, I was still running into issues. I made some other changes, but they don't seem to be relevant.
I wasn't getting these lines earlier:
16/06/10 23:10:08 INFO mapreduce.JobSubmitter: number of splits:1
16/06/10 23:10:08 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1465582268359_0017
16/06/10 23:10:09 INFO impl.YarnClientImpl: Submitted application application_1465582268359_0017
16/06/10 23:10:09 INFO mapreduce.Job: The url to track the job: http://sandbox.hortonworks.com:8088/proxy/application_1465582268359_0017/
I wrote a sample app, and when that worked, I went through the differences between it and my app one by one and realized that my Hadoop version was incorrect.
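For anyone wondering where the Application Id comes from: as the client output quoted above shows, it is printed when the job is submitted. A minimal sketch for pulling it out of that output follows; the function name extract_app_id is hypothetical, and the pattern assumes the ID looks like the one in the log line above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): print the first YARN
# application ID found in the text read from stdin.
extract_app_id() {
  grep -o 'application_[0-9][0-9]*_[0-9][0-9]*' | head -n 1
}

# Example using a line copied from the client output in this thread.
echo '16/06/10 23:10:09 INFO impl.YarnClientImpl: Submitted application application_1465582268359_0017' | extract_app_id
```

The printed ID is what the yarn logs -applicationId command mentioned elsewhere in this thread expects. When capturing output from a real job, the client messages may go to stderr rather than stdout, so redirecting with 2>&1 before piping is probably needed.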
Created 06-10-2016 05:48 PM
Can you check with yarn logs -applicationId <application ID>?
Created 06-10-2016 06:13 PM
Rajkumar, that's the problem I have. Where do I get the Application Id from? Is there some different command that I need to use to start the MapReduce job or some flags I need to use? The same job used to work in MapR hadoop 1 with the logs going to the syslog directory. In Ambari, I'm seeing no applications at all.
Created 11-30-2017 02:26 PM
Hi Milind Rao. You can find the Application Id via Ambari -> YARN -> Quick Links -> ResourceManager UI.
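If only the MapReduce job ID is at hand (for example from the JobHistory UI), the client output in this thread suggests that the Application Id carries the same numeric part with a different prefix (job_1465582268359_0017 and application_1465582268359_0017). A small sketch of that conversion, assuming this naming holds; job_to_app_id is a hypothetical name, not a Hadoop command.

```shell
#!/bin/sh
# Hypothetical helper (not part of Hadoop): turn a MapReduce job ID
# into the matching YARN application ID by swapping the prefix.
job_to_app_id() {
  printf '%s\n' "$1" | sed 's/^job_/application_/'
}

# Example using the job ID from the client output in this thread.
job_to_app_id job_1465582268359_0017
```

The result can then be passed to yarn logs -applicationId.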