Created 06-14-2016 03:09 PM
We are using Logstash as the centralized storage location for all our logs, including the Hadoop logs. So far we have managed to change log4j in most components so that they additionally send their logs to our Logstash cluster. (This includes Kafka, Hive, Ambari, MapReduce2, YARN, etc.)
Our approach was to add an extra log4j appender to each component's log4j.properties file and extend the root logger with that appender. We would have liked to replace log4j completely with a logback logger, but that would be far more work than living with log4j's drawbacks is worth for now. (If someone has an easy way to do this, I would be super grateful.)
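In short, the pattern we add looks roughly like this (host and port are placeholders; the full HDFS log4j.properties we ended up with is pasted further down):
# additional appender that ships log events to Logstash via a log4j SocketAppender
log4j.appender.logstash=org.apache.log4j.net.SocketAppender
log4j.appender.logstash.RemoteHost=<Logstash IP>
log4j.appender.logstash.Port=<Logstash Port>
log4j.appender.logstash.ReconnectionDelay=10000
# extend the existing root logger with the new appender
hadoop.root.logger=INFO,RFA,logstash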
My actual problem is that this approach is not working with HDFS.
For some reason the changes we apply in the HDFS log4j.properties are ignored. (We use the Ambari web interface to configure this.)
The other appenders still work fine, but the logstash appender is ignored completely. Are there any log4j settings in HDFS that we have to consider which do not exist for the other services?
What bugs me is that all the other services worked like a charm; only HDFS is holding out.... 😞
Thanks for any help in advance.
Our setup:
HDP: HDP-2.3.4.0-3485
HDFS: 2.7.1.2.3
hadoop.root.logger = INFO,RFA,logstash
hdfs-log4j.properties
#
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
#

# Define some default values that can be overridden by system properties
# To change daemon root logger use hadoop_root_logger in hadoop-env
hadoop.root.logger=INFO,console,logstash
#hadoop.log.dir=.
hadoop.log.dir=/mnt/storage/log/hadoop
hadoop.log.file=hadoop.log

#
#Logstash Appender
#
log4j.appender.logstash=org.apache.log4j.net.SocketAppender
log4j.appender.logstash.Port=<Logstash Port>
log4j.appender.logstash.RemoteHost=<Logstash IP>
log4j.appender.logstash.ReconnectionDelay=10000
log4j.appender.logstash.Application=hdfs

# Define the root logger to the system property "hadoop.root.logger".
log4j.rootLogger=${hadoop.root.logger}, EventCounter

# Logging Threshold
log4j.threshhold=ALL

#
# Daily Rolling File Appender
#
log4j.appender.DRFA=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFA.File=${hadoop.log.dir}/${hadoop.log.file}
# Rollver at midnight
log4j.appender.DRFA.DatePattern=.yyyy-MM-dd
# 30-day backup
#log4j.appender.DRFA.MaxBackupIndex=30
log4j.appender.DRFA.layout=org.apache.log4j.PatternLayout
# Pattern format: Date LogLevel LoggerName LogMessage
log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
# Debugging Pattern format
#log4j.appender.DRFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n

#
# console
# Add "console" to rootlogger above if you want to use this
#
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n

#
# TaskLog Appender
#
#Default values
hadoop.tasklog.taskid=null
hadoop.tasklog.iscleanup=false
hadoop.tasklog.noKeepSplits=4
hadoop.tasklog.totalLogFileSize=100
hadoop.tasklog.purgeLogSplits=true
hadoop.tasklog.logsRetainHours=12
log4j.appender.TLA=org.apache.hadoop.mapred.TaskLogAppender
log4j.appender.TLA.taskId=${hadoop.tasklog.taskid}
log4j.appender.TLA.isCleanup=${hadoop.tasklog.iscleanup}
log4j.appender.TLA.totalLogFileSize=${hadoop.tasklog.totalLogFileSize}
log4j.appender.TLA.layout=org.apache.log4j.PatternLayout
log4j.appender.TLA.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n

#
#Security audit appender
#
hadoop.security.logger=INFO,logstash,console
hadoop.security.log.maxfilesize=256MB
hadoop.security.log.maxbackupindex=20
log4j.category.SecurityLogger=${hadoop.security.logger}
hadoop.security.log.file=SecurityAuth.audit
log4j.appender.DRFAS=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFAS.File=${hadoop.log.dir}/${hadoop.security.log.file}
log4j.appender.DRFAS.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
log4j.appender.DRFAS.DatePattern=.yyyy-MM-dd
log4j.appender.RFAS=org.apache.log4j.RollingFileAppender
log4j.appender.RFAS.File=${hadoop.log.dir}/${hadoop.security.log.file}
log4j.appender.RFAS.layout=org.apache.log4j.PatternLayout
log4j.appender.RFAS.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n
log4j.appender.RFAS.MaxFileSize=${hadoop.security.log.maxfilesize}
log4j.appender.RFAS.MaxBackupIndex=${hadoop.security.log.maxbackupindex}

#
# hdfs audit logging
#
hdfs.audit.logger=INFO,logstash,console
log4j.logger.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=${hdfs.audit.logger}
log4j.additivity.org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit=false
log4j.appender.DRFAAUDIT=org.apache.log4j.DailyRollingFileAppender
log4j.appender.DRFAAUDIT.File=${hadoop.log.dir}/hdfs-audit.log
log4j.appender.DRFAAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.DRFAAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.DRFAAUDIT.DatePattern=.yyyy-MM-dd

#
# NameNode metrics logging.
# The default is to retain two namenode-metrics.log files up to 64MB each.
#
namenode.metrics.logger=INFO,logstash,NullAppender
log4j.logger.NameNodeMetricsLog=${namenode.metrics.logger}
log4j.additivity.NameNodeMetricsLog=false
log4j.appender.NNMETRICSRFA=org.apache.log4j.RollingFileAppender
log4j.appender.NNMETRICSRFA.File=${hadoop.log.dir}/namenode-metrics.log
log4j.appender.NNMETRICSRFA.layout=org.apache.log4j.PatternLayout
log4j.appender.NNMETRICSRFA.layout.ConversionPattern=%d{ISO8601} %m%n
log4j.appender.NNMETRICSRFA.MaxBackupIndex=1
log4j.appender.NNMETRICSRFA.MaxFileSize=64MB

#
# mapred audit logging
#
mapred.audit.logger=INFO,logstash,console
log4j.logger.org.apache.hadoop.mapred.AuditLogger=${mapred.audit.logger}
log4j.additivity.org.apache.hadoop.mapred.AuditLogger=false
log4j.appender.MRAUDIT=org.apache.log4j.DailyRollingFileAppender
log4j.appender.MRAUDIT.File=${hadoop.log.dir}/mapred-audit.log
log4j.appender.MRAUDIT.layout=org.apache.log4j.PatternLayout
log4j.appender.MRAUDIT.layout.ConversionPattern=%d{ISO8601} %p %c{2}: %m%n
log4j.appender.MRAUDIT.DatePattern=.yyyy-MM-dd

#
# Rolling File Appender
#
log4j.appender.RFA=org.apache.log4j.RollingFileAppender
log4j.appender.RFA.File=${hadoop.log.dir}/${hadoop.log.file}
# Logfile size and and 30-day backups
log4j.appender.RFA.MaxFileSize=256MB
log4j.appender.RFA.MaxBackupIndex=10
log4j.appender.RFA.layout=org.apache.log4j.PatternLayout
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} - %m%n
log4j.appender.RFA.layout.ConversionPattern=%d{ISO8601} %-5p %c{2} (%F:%M(%L)) - %m%n

# Custom Logging levels
hadoop.metrics.log.level=INFO,logstash
#log4j.logger.org.apache.hadoop.mapred.JobTracker=DEBUG
#log4j.logger.org.apache.hadoop.mapred.TaskTracker=DEBUG
#log4j.logger.org.apache.hadoop.fs.FSNamesystem=DEBUG
log4j.logger.org.apache.hadoop.metrics2=${hadoop.metrics.log.level}

# Jets3t library
log4j.logger.org.jets3t.service.impl.rest.httpclient.RestS3Service=ERROR

#
# Null Appender
# Trap security logger on the hadoop client side
#
log4j.appender.NullAppender=org.apache.log4j.varia.NullAppender

#
# Event Counter Appender
# Sends counts of logging messages at different severity levels to Hadoop Metrics.
#
log4j.appender.EventCounter=org.apache.hadoop.log.metrics.EventCounter

# Removes "deprecated" messages
log4j.logger.org.apache.hadoop.conf.Configuration.deprecation=WARN

#
# HDFS block state change log from block manager
#
# Uncomment the following to suppress normal block state change
# messages from BlockManager in NameNode.
#log4j.logger.BlockStateChange=WARN
Created 06-14-2016 05:29 PM
hadoop.root.logger=INFO,console,logstash
If you have the above setting in log4j.properties, then it acts as the default if the process launch does not pass -Dhadoop.root.logger. However, if the argument is passed, then that argument acts as an override, and the value specified in log4j.properties is ignored.
One way to check if this is happening is to look at the process table, such as by running this:
ps auxwww | grep NameNode | grep 'hadoop.root.logger'
Then, look at the full command line of the NameNode process. If it does not include your logstash appender, then this is the likely explanation for what you are seeing.
To change the arguments passed at launch of the NameNode, edit hadoop-env.sh and find the variable HADOOP_NAMENODE_OPTS. Within the value for that environment variable, you can add the setting for -Dhadoop.root.logger=INFO,console,logstash. If you add your logstash appender there and restart the NameNode, then it will pass that argument value down to the new process launch, and I expect it will activate your logstash appender.
Also, the more common setting in HDP deployments is INFO,RFA, so it might be more appropriate for you to set the value to -Dhadoop.root.logger=INFO,RFA,logstash.
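For example, something along these lines in hadoop-env.sh (a sketch only; your existing HADOOP_NAMENODE_OPTS value already contains other flags, so just add the -D option to it):
# pass the root logger override to the NameNode at launch
export HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS} -Dhadoop.root.logger=INFO,RFA,logstash"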
Created 06-15-2016 08:38 AM
Thanks a lot for sharing your ideas @Chris Nauroth!
Your ideas helped me understand the logging system here a bit better.
I extended the hadoop-env template field in Ambari with the following line:
export HADOOP_ROOT_LOGGER="INFO,RFA,logstash"
This results in a NameNode command line like this:
/usr/jdk64/jdk1.8.0_60/bin/java -Dproc_namenode -Xmx1024m -Dhdp.version=2.3.4.0-3485 -Djava.net.preferIPv4Stack=true -Dhdp.version= -Djava.net.preferIPv4Stack=true -Dhdp.version= -Djava.net.preferIPv4Stack=true -Dhadoop.log.dir=/mnt/storage/log/hadoop/hdfs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/usr/hdp/2.3.4.0-3485/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA,logstash -Djava.library.path=:/usr/hdp/2.3.4.0-3485/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.3.4.0-3485/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -Dhdp.version=2.3.4.0-3485 -Dhadoop.log.dir=/mnt/storage/log/hadoop/hdfs -Dhadoop.log.file=hadoop-hdfs-namenode-i-fd561d77.log -Dhadoop.home.dir=/usr/hdp/2.3.4.0-3485/hadoop -Dhadoop.id.str=hdfs -Dhadoop.root.logger=INFO,RFA,logstash -Djava.library.path=:/usr/hdp/2.3.4.0-3485/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.3.4.0-3485/hadoop/lib/native:/usr/hdp/2.3.4.0-3485/hadoop/lib/native/Linux-amd64-64:/usr/hdp/2.3.4.0-3485/hadoop/lib/native -Dhadoop.policy.file=hadoop-policy.xml -Djava.net.preferIPv4Stack=true -server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/mnt/storage/log/hadoop/hdfs/hs_err_pid%p.log -XX:NewSize=128m -XX:MaxNewSize=128m -Xloggc:/mnt/storage/log/hadoop/hdfs/gc.log-201606150820 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms1024m -Xmx1024m -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node" -Dorg.mortbay.jetty.Request.maxFormContentSize=-1 -server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/mnt/storage/log/hadoop/hdfs/hs_err_pid%p.log -XX:NewSize=128m -XX:MaxNewSize=128m -Xloggc:/mnt/storage/log/hadoop/hdfs/gc.log-201606150820 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms1024m -Xmx1024m -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node" -Dorg.mortbay.jetty.Request.maxFormContentSize=-1 -server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/mnt/storage/log/hadoop/hdfs/hs_err_pid%p.log -XX:NewSize=128m -XX:MaxNewSize=128m -Xloggc:/mnt/storage/log/hadoop/hdfs/gc.log-201606150820 -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms1024m -Xmx1024m -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT -XX:OnOutOfMemoryError="/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node" -Dorg.mortbay.jetty.Request.maxFormContentSize=-1 -Dhadoop.security.logger=INFO,RFAS org.apache.hadoop.hdfs.server.namenode.NameNode
As you can see, hadoop.root.logger is set to logstash as it is supposed to be (even though it appears twice; no idea why, but I don't care for now). Still, there are a lot more loggers configured for various purposes, e.g. security, audit and so on... I tried to modify these to log to logstash manually as well, so at some point all loggers had the logstash appender on them. But all in all, nothing is being logged to logstash so far, despite all these changes and multiple tries...
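For reference, by "these" I mean the overrides that show up as -D arguments in the command line above. Extending them looked roughly like this (a sketch; the appender name matches the logstash appender from my log4j.properties):
-Dhadoop.security.logger=INFO,DRFAS,logstash
-Dhdfs.audit.logger=INFO,DRFAAUDIT,logstash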
What is left?
1. The state of Hadoop logging (at least when using HDP) is kind of a mess. Switching loggers is horribly hard and confusing.
2. Can anyone tell me which of the actual conf files on the server is written when changing the properties in Ambari? Is it /etc/hadoop/conf or /etc/hadoop/2.3.4.0-3485/0?
3. Any other ideas why the NameNode still does not log anything to logstash?
Created 06-15-2016 06:32 PM
@Thees Gieselmann, here are a few follow-ups.
Note that by setting HADOOP_ROOT_LOGGER, you set this for multiple Hadoop processes, not just the NameNode; I just wanted to make sure that was your goal. The duplication of arguments on the command line is a known bug, which will be fixed in Apache Hadoop 3.
You can refer to the configs at the path /etc/hadoop/conf. There are symlinks in place there that redirect to the correct configuration directory for the currently active HDP version, such as 2.3.4.0-3485.
Unfortunately, I don't have any further experience with logstash to address your remaining question. The logger configuration being passed looks right at this point, so perhaps it's time to investigate what troubleshooting options logstash itself provides.
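To check where that /etc/hadoop/conf symlink resolves on a particular host, you can run, for example:
ls -l /etc/hadoop/conf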
Created 06-16-2016 07:55 AM
Thanks a lot anyway 😉
I will come back and post if I figure out the answer at some point.
Created 08-30-2016 01:30 PM
Ok.... I now know what the issue was.
The log4j.properties configs from YARN and HDFS are actually written into the same file on the operating system. Because both of my logstash appenders used the same name, whichever config was written last always overwrote the other one. Giving each logstash appender a distinct name solved the issue. It is weird that these configs are split in Ambari, though...
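To illustrate, the fix boiled down to something like this (the appender names here are just examples, anything distinct works; host and port are placeholders as before):
# HDFS log4j settings
log4j.appender.logstash_hdfs=org.apache.log4j.net.SocketAppender
log4j.appender.logstash_hdfs.RemoteHost=<Logstash IP>
log4j.appender.logstash_hdfs.Port=<Logstash Port>
hadoop.root.logger=INFO,RFA,logstash_hdfs
# YARN log4j settings: same idea, but with its own appender name
log4j.appender.logstash_yarn=org.apache.log4j.net.SocketAppender
log4j.appender.logstash_yarn.RemoteHost=<Logstash IP>
log4j.appender.logstash_yarn.Port=<Logstash Port>
with the corresponding root logger entry on the YARN side referencing logstash_yarn instead.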