Created 08-22-2023 12:48 AM
We keep experiencing hdfs-datanode pod restarts caused by this process. Is there a way to reduce its CPU usage or otherwise optimize the process?
VM:
Capacity:
cpu: 17
memory: 106231200Ki
top
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 hadoop 20 0 33.0g 1.2g 30076 S 101.7 1.1 35:13.62 java
ps -aux | less
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
hadoop 1 95.9 1.1 34596096 1219024 ? Ssl 06:59 35:42 /etc/alternatives/jre/bin/java -Dproc_datanode -Djava.net.preferIPv4Stack=true -Dhadoop.security.logger=ERROR,RFAS -XX:+UseG1GC -XX:G1HeapRegionSize=32M -XX:+UseGCOverheadLimit -XX:+ExplicitGCInvokesConcurrent -XX:+HeapDumpOnOutOfMemoryError -XX:+ExitOnOutOfMemoryError -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCDateStamps -Xloggc:/opt/hadoop-3.1.1/logs/gc.log -Dcom.sun.management.jmxremote=true -Dcom.sun.management.jmxremote.authenticate=false -Dcom.sun.management.jmxremote.ssl=false -Dcom.sun.management.jmxremote.port=1026 -Dyarn.log.dir=/opt/hadoop-3.1.1/logs -Dyarn.log.file=hadoop.log -Dyarn.home.dir=/opt/hadoop-3.1.1 -Dyarn.root.logger=INFO,console -Djava.library.path=/opt/hadoop-3.1.1/lib/native -Xmx30720m -Dhadoop.log.dir=/opt/hadoop-3.1.1/logs -Dhadoop.log.file=hadoop.log -Dhadoop.home.dir=/opt/hadoop-3.1.1 -Dhadoop.id.str=hadoop -Dhadoop.root.logger=DEBUG,console -Dhadoop.policy.file=hadoop-policy.xml org.apache.hadoop.hdfs.server.datanode.DataNode
Thank you!
Created on 08-23-2023 04:56 AM - edited 08-23-2023 04:57 AM
Hi Noel,
The process you are pointing at is the DataNode: see the last parameter of the java command, which is the main class,
org.apache.hadoop.hdfs.server.datanode.DataNode
I have never experienced such a problem with this component.
Perhaps you should review this component's logs: it may be retrying something that fails in an infinite loop.
Open /var/log/hadoop-hdfs/hadoop-cmf-hdfs-DATANODE-<host-name>.log.out on this server node to check it.
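As a rough sketch of what that check could look like (assuming the default log location above; replace the <host-name> placeholder with the actual hostname, and adjust the path if your distribution puts logs elsewhere):

$ tail -n 200 /var/log/hadoop-hdfs/hadoop-cmf-hdfs-DATANODE-<host-name>.log.out
# count ERROR/Exception lines; a rapidly growing count usually points to a retry loop
$ grep -cE 'ERROR|Exception' /var/log/hadoop-hdfs/hadoop-cmf-hdfs-DATANODE-<host-name>.log.out
# watch the log live to see whether the same message repeats every few seconds
$ tail -f /var/log/hadoop-hdfs/hadoop-cmf-hdfs-DATANODE-<host-name>.log.out | grep -E 'ERROR|WARN'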
Created 08-24-2023 12:36 AM
Hi @gael__urbauer ,
I exec'd into the hdfs-datanode pod and went to /var/log, but we didn't see any hadoop-hdfs folder in it.
$ kubectl exec -it hdfs-datanode-0 bash
bash-4.2$ cd /var/log/
bash-4.2$ ls -l
total 296
-rw------- 1 root utmp 0 Oct 6 2018 btmp
-rw-r--r-- 1 root root 193 Oct 6 2018 grubby_prune_debug
-rw-r--r-- 1 root root 292876 Nov 22 2018 lastlog
-rw------- 1 root root 0 Oct 6 2018 tallylog
-rw-rw-r-- 1 root utmp 0 Oct 6 2018 wtmp
-rw------- 1 root root 4004 Nov 22 2018 yum.log
Created 08-24-2023 01:06 AM
The location depends on the distribution and can be changed in the configuration.
See this article, which gives guidelines for finding the logs of the YARN NodeManager:
Solved: where are hadoop log files ? - Cloudera Community - 115681
And don't forget to replace YARN with HDFS where needed, as you are looking for the HDFS DataNode service and not the YARN NodeManager.
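As a hedged sketch (reusing the hdfs-datanode-0 pod name and the PID 1 process from earlier in this thread), you can also read the log location directly from the running DataNode's JVM arguments; the ps output above already includes -Dhadoop.log.dir=/opt/hadoop-3.1.1/logs and -Xloggc:/opt/hadoop-3.1.1/logs/gc.log:

# print the log-related JVM arguments of the DataNode process (PID 1 in the pod)
$ kubectl exec -it hdfs-datanode-0 -- bash -c "tr '\0' '\n' < /proc/1/cmdline | grep -E 'hadoop.log.dir|Xloggc'"
# then inspect whatever directory that reveals, e.g.
$ kubectl exec -it hdfs-datanode-0 -- ls -l /opt/hadoop-3.1.1/logs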
Created 08-24-2023 01:08 AM
BTW, it is strange to run a DataNode in Kubernetes, as pods are usually used for stateless workloads, and a DataNode is almost exclusively stateful by nature since it keeps HDFS data.
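If the DataNode does have to run in Kubernetes, the usual expectation is a StatefulSet backed by persistent volumes so the block data survives pod restarts. A quick, hedged way to check that setup (again assuming the hdfs-datanode-0 pod name from this thread):

# should print StatefulSet rather than ReplicaSet or nothing
$ kubectl get pod hdfs-datanode-0 -o jsonpath='{.metadata.ownerReferences[*].kind}'
# lists the PersistentVolumeClaims backing the DataNode's data directories, if any
$ kubectl get pvc | grep hdfs-datanode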