Support Questions

Find answers, ask questions, and share your expertise

how to rotate the HDFS gc logs on data node machines

avatar

hi all

we are trying to rotate the gc logs on workers machines according to the following

we add the line

-XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=2K

inside HADOOP_DATANODE_OPTS variable

in HDFS --> Advanced hadoop-env

as the following:

<br>export HADOOP_DATANODE_OPTS="-server -XX:ParallelGCThreads=4 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/$USER/hs_err_pid%p.log -XX:NewSize=200m -XX:MaxNewSize=200m -Xloggc:/var/log/hadoop/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=2K -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms{{dtnode_heapsize}} -Xmx{{dtnode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT ${HADOOP_DATANODE_OPTS}"

but we get on worker machine ( data node under /var/log/hadoop/hdfs ) , the follwing

-rw-r--r--  1 hdfs hadoop   1035 Aug 26 20:33 hadoop-hdfs-datanode-worker01.sys76.com.log.10
-rw-r--r--  1 hdfs hadoop   1035 Aug 26 20:35 hadoop-hdfs-datanode-worker01.sys76.com.log.9
-rw-r--r--  1 hdfs hadoop   1035 Aug 26 20:36 hadoop-hdfs-datanode-worker01.sys76.com.log.8
-rw-r--r--  1 hdfs hadoop   1504 Aug 26 20:37 hadoop-hdfs-datanode-worker01.sys76.com.log.7
-rw-r--r--  1 hdfs hadoop   1035 Aug 26 20:38 hadoop-hdfs-datanode-worker01.sys76.com.log.6
-rw-r--r--  1 hdfs hadoop   1035 Aug 26 20:40 hadoop-hdfs-datanode-worker01.sys76.com.log.5
-rw-r--r--  1 hdfs hadoop   1035 Aug 26 20:41 hadoop-hdfs-datanode-worker01.sys76.com.log.4
-rw-r--r--  1 hdfs hadoop   1035 Aug 26 20:43 hadoop-hdfs-datanode-worker01.sys76.com.log.3
-rw-r--r--  1 hdfs hadoop   1035 Aug 26 20:44 hadoop-hdfs-datanode-worker01.sys76.com.log.2
-rw-r--r--  1 hdfs hadoop   1035 Aug 26 20:46 hadoop-hdfs-datanode-worker01.sys76.com.log.1
-rw-r--r--  1 hdfs hadoop    727 Aug 26 20:46 hadoop-hdfs-datanode-worker01.sys76.com.out.4
-rw-r--r--  1 hdfs hadoop   3905 Aug 26 20:46 gc.log-201808262031.0.current
-rw-r--r--  1 hdfs hadoop    727 Aug 26 20:46 hadoop-hdfs-datanode-worker01.sys76.com.out.3
-rw-r--r--. 1 hdfs hadoop 101346 Aug 26 21:53 SecurityAuth.audit
-rw-r--r--  1 hdfs hadoop   7488 Aug 27 06:25 gc.log-201808262046
-rw-r--r--  1 hdfs hadoop    727 Aug 27 06:28 hadoop-hdfs-datanode-worker01.sys76.com.out.2
-rw-r--r--  1 hdfs hadoop   3651 Aug 27 06:28 gc.log-201808270625.0.current
-rw-r--r--  1 hdfs hadoop    727 Aug 27 06:29 hadoop-hdfs-datanode-worker01.sys76.com.out.1
-rw-r--r--  1 hdfs hadoop    727 Aug 27 06:29 hadoop-hdfs-datanode-worker01.sys76.com.out
-rw-r--r--  1 hdfs hadoop   2940 Aug 27 06:29 gc.log-201808270629.0.current
-rw-r--r--  1 hdfs hadoop 378069 Aug 27 06:34 hadoop-hdfs-datanode-worker01.sys76.com.log

why gc.log.... are not rotated and comes with "0.current"

what is wrong with my configuration?

** we also try the following but this isn't worked

In hadoop-env template wherever you find -XX:+PrintGCDetails parameter you can add -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=2M

For example:

From:

HADOOP_JOBTRACKER_OPTS="-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile={{hdfs_log_dir_prefix}}/$USER/hs_err_pid%p.log -XX:NewSize={{jtnode_opt_newsize}} -XX:MaxNewSize={{jtnode_opt_maxnewsize}} -Xloggc:{{hdfs_log_dir_prefix}}/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xmx{{jtnode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dmapred.audit.logger=INFO,MRAUDIT -Dhadoop.mapreduce.jobsummary.logger=INFO,JSA ${HADOOP_JOBTRACKER_OPTS}"

To:

HADOOP_JOBTRACKER_OPTS="-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile={{hdfs_log_dir_prefix}}/$USER/hs_err_pid%p.log -XX:NewSize={{jtnode_opt_newsize}} -XX:MaxNewSize={{jtnode_opt_maxnewsize}} -Xloggc:{{hdfs_log_dir_prefix}}/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails <strong>-XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=2M </strong>-XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xmx{{jtnode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dmapred.audit.logger=INFO,MRAUDIT -Dhadoop.mapreduce.jobsummary.logger=INFO,JSA ${HADOOP_JOBTRACKER_OPTS}"
Michael-Bronson
1 ACCEPTED SOLUTION

avatar
Master Mentor

@Michael Bronson

If you just want to quickly fill the GC log to see the log rotation then try this instead of using "cat" comamand to add content inside the actual current GC log file.

# su - hdfs
# /usr/jdk64/jdk1.8.0_112/bin/jcmd `cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid` GC.run
.
# SYNTAX: # $JAVA_HOME/bin/jcmd $DATANODE_PID GC.run

Use the "jcmd" utility to explicitly trigger the GC many times. So after running this command around 20-30 times you will see that the GC log of data node gets rotated and in the current GC log you will see some messages like:

2018-08-27 08:11:07 GC log file has reached the maximum size. Saved as /var/log/hadoop/hdfs/gc.log-201808270803.0

.

I see the log rotations works as expected as following:

-rw-r--r--. 1 hdfs hadoop     10617 Aug 27 08:12 gc.log-201808270803.1
-rw-r--r--. 1 hdfs hadoop     10615 Aug 27 08:13 gc.log-201808270803.2
-rw-r--r--. 1 hdfs hadoop     10616 Aug 27 08:13 gc.log-201808270803.3
-rw-r--r--. 1 hdfs hadoop      4001 Aug 27 08:13 gc.log-201808270803.4.current

.

View solution in original post

20 REPLIES 20

avatar

@Jay

I do the following in order to test it ( we append another log to gc.log-201808270656.0.current

[root@worker01 hdfs]# cat  gc.log-201808262046 >> gc.log-201808270656.0.current
[root@worker01 hdfs]# du -sh * | grep  gc.log-201808270656.0.current
12K     gc.log-201808270656.0.current

as you see now the log is 12K , and still not rotated

Michael-Bronson

avatar

@Jay , I see that you add the "-verbose:gc"

do we need also add this?

Michael-Bronson

avatar
Master Mentor

@Michael Bronson

If you just want to quickly fill the GC log to see the log rotation then try this instead of using "cat" comamand to add content inside the actual current GC log file.

# su - hdfs
# /usr/jdk64/jdk1.8.0_112/bin/jcmd `cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid` GC.run
.
# SYNTAX: # $JAVA_HOME/bin/jcmd $DATANODE_PID GC.run

Use the "jcmd" utility to explicitly trigger the GC many times. So after running this command around 20-30 times you will see that the GC log of data node gets rotated and in the current GC log you will see some messages like:

2018-08-27 08:11:07 GC log file has reached the maximum size. Saved as /var/log/hadoop/hdfs/gc.log-201808270803.0

.

I see the log rotations works as expected as following:

-rw-r--r--. 1 hdfs hadoop     10617 Aug 27 08:12 gc.log-201808270803.1
-rw-r--r--. 1 hdfs hadoop     10615 Aug 27 08:13 gc.log-201808270803.2
-rw-r--r--. 1 hdfs hadoop     10616 Aug 27 08:13 gc.log-201808270803.3
-rw-r--r--. 1 hdfs hadoop      4001 Aug 27 08:13 gc.log-201808270803.4.current

.

avatar

@Jay

I do the following

[root@worker01 hdfs]# su - hdfs
Last login: Mon Aug 27 08:19:56 UTC 2018 on pts/1
[hdfs@worker01 ~]$  /usr/jdk64/jdk1.8.0_112/bin/jcmd `cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid` GC.run
118714:
Command executed successfully
[hdfs@worker01 ~]$  /usr/jdk64/jdk1.8.0_112/bin/jcmd `cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid` GC.run
118714:
Command executed successfully
[hdfs@worker01 ~]$  /usr/jdk64/jdk1.8.0_112/bin/jcmd `cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid` GC.run
118714:
Command executed successfully
[hdfs@worker01 ~]$  /usr/jdk64/jdk1.8.0_112/bin/jcmd `cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid` GC.run
118714:
Command executed successfully
[hdfs@worker01 ~]$  /usr/jdk64/jdk1.8.0_112/bin/jcmd `cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid` GC.run
118714:
Command executed successfully
[hdfs@worker01 ~]$  /usr/jdk64/jdk1.8.0_112/bin/jcmd `cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid` GC.run
118714:
Command executed successfully
[hdfs@worker01 ~]$  /usr/jdk64/jdk1.8.0_112/bin/jcmd `cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid` GC.run
118714:
Command executed successfully
[hdfs@worker01 ~]$  /usr/jdk64/jdk1.8.0_112/bin/jcmd `cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid` GC.run
118714:
Command executed successfully
[hdfs@worker01 ~]$  /usr/jdk64/jdk1.8.0_112/bin/jcmd `cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid` GC.run
118714:
Command executed successfully


but still even if gc log is now 12K , its not rotated

Michael-Bronson

avatar
Master Mentor

@Michael Bronson

You will need to make sure that you have not edited the GC log file manually. Like you did earlier. it will cause JVM to loose the logging indexing on the GC log file. So if by any chance you used the following command then it wont work

# cat  gc.log-201808262046 >> gc.log-201808270656.0.current

You need a GC log file (Uninterrupted) means you have not touched the file.

.

Please try restarting this DataNode after deleting the manually edited gc log file. and then keep running the following command for 20-30 times.

/usr/jdk64/jdk1.8.0_112/bin/jcmd `cat /var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid` GC.run

.

avatar

@Jay I attached the log4j

Michael-Bronson

avatar
Master Mentor

@Michael Bronson

Please find my working log4j template content. I just added the following two additional parameters in my DataNode JVm option because i wanted the GC log to get filled up more quickly. -XX:+PrintTenuringDistribution -XX:+PrintAdaptiveSizePolicy

      {% else %}
      SHARED_HADOOP_NAMENODE_OPTS="-server -XX:ParallelGCThreads=8 -XX:+UseConcMarkSweepGC -XX:ErrorFile={{hdfs_log_dir_prefix}}/$USER/hs_err_pid%p.log -XX:NewSize={{namenode_opt_newsize}} -XX:MaxNewSize={{namenode_opt_maxnewsize}} -Xloggc:{{hdfs_log_dir_prefix}}/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -Xms{{namenode_heapsize}} -Xmx{{namenode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT"
      export HADOOP_NAMENODE_OPTS="${SHARED_HADOOP_NAMENODE_OPTS} -XX:OnOutOfMemoryError=\"/usr/hdp/current/hadoop-hdfs-namenode/bin/kill-name-node\" -Dorg.mortbay.jetty.Request.maxFormContentSize=-1 ${HADOOP_NAMENODE_OPTS}"
      export HADOOP_DATANODE_OPTS="-server -XX:ParallelGCThreads=4 -XX:+UseConcMarkSweepGC -XX:ErrorFile=/var/log/hadoop/$USER/hs_err_pid%p.log -XX:NewSize=200m -XX:MaxNewSize=200m -Xloggc:/var/log/hadoop/$USER/gc.log-`date +'%Y%m%d%H%M'` -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xms{{dtnode_heapsize}} -Xmx{{dtnode_heapsize}} -Dhadoop.security.logger=INFO,DRFAS -Dhdfs.audit.logger=INFO,DRFAAUDIT ${HADOOP_DATANODE_OPTS} -XX:CMSInitiatingOccupancyFraction=70 -XX:+UseCMSInitiatingOccupancyOnly -XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=10K -XX:+PrintTenuringDistribution -XX:+PrintAdaptiveSizePolicy"

test-log4j.txt

Notice in my working DataNode opts i have the following settings

-XX:+PrintGCTimeStamps -XX:+UseGCLogFileRotation -XX:NumberOfGCLogFiles=5 -XX:GCLogFileSize=10K -XX:+PrintTenuringDistribution -XX:+PrintAdaptiveSizePolicy

.

avatar

@jay ,

now its working thank to your advice

-rw-r--r--  1 hdfs hadoop   10572 Aug 27 09:18 gc.log-201808270913.0
-rw-r--r--  1 hdfs hadoop   10427 Aug 27 09:19 gc.log-201808270913.1
-rw-r--r--  1 hdfs hadoop   10427 Aug 27 09:20 gc.log-201808270913.2
-rw-r--r--. 1 hdfs hadoop  109326 Aug 27 09:20 SecurityAuth.audit
-rw-r--r--  1 hdfs hadoop 1131260 Aug 27 09:20 hadoop-hdfs-datanode-worker01.sys73.com.log
-rw-r--r--  1 hdfs hadoop    9765 Aug 27 09:20 gc.log-201808270913.3.current
Michael-Bronson

avatar
Master Mentor

@Michael Bronson

Wonderful !!!!!

avatar

@Jay , can I do the same for HADOOP_NAMENODE_OPTS variable ?

Michael-Bronson