log-dirs are bad: /var/log/hadoop-yarn/container
Labels: Apache YARN
Created on ‎10-29-2017 03:38 AM - edited ‎09-16-2022 05:27 AM
Hi Guys,
From time to time some NodeManagers get lost in YARN, with the reason given as "log-dirs are bad: /var/log/hadoop-yarn/container".
I checked the disk space and don't see any issue there. In the ResourceManager log I see:
INFO org.apache.hadoop.yarn.server.resourcemanager.amlauncher.AMLauncher: Done launching container Container: [ContainerId: container_e37_1509251204123_1378_01_000001, NodeId: avpr-dhc001.lpdomain.com:8041, NodeHttpAddress: avpr-dhc001.lpdomain.com:8042, Resource: <memory:2048, vCores:1>, Priority: 0, Token: Token { kind: ContainerToken, service: 172.16.144.140:8041 }, ] for AM appattempt_1509251204123_1378_000001
2017-10-29 05:08:22,593 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: Node avpr-dhc001.lpdomain.com:8041 reported UNHEALTHY with details: 1/1 log-dirs are bad: /liveperson/hadoop/log/hadoop-yarn/container
2017-10-29 05:08:22,593 INFO org.apache.hadoop.yarn.server.resourcemanager.rmnode.RMNodeImpl: avpr-dhc001.lpdomain.com:8041 Node Transitioned from RUNNING to UNHEALTHY
I don't see any issues in the DataNode or NodeManager logs.
There is no inode exhaustion on the server either.
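For anyone repeating the disk-space and inode checks mentioned above, here is a minimal sketch in Python. It is only an illustration: the helper names `usage_percent` and `disk_and_inode_usage` are made up for this example, and it reports raw filesystem numbers rather than whatever the NodeManager health checker computes internally.

```python
import os


def usage_percent(total, available):
    """Return the used share of a resource as a percentage (0.0 if total is 0)."""
    if total == 0:
        return 0.0
    return 100.0 * (total - available) / total


def disk_and_inode_usage(path):
    """Return (block usage %, inode usage %) for the filesystem holding path.

    Uses the counts available to unprivileged users, so blocks reserved
    for root are counted as used.
    """
    st = os.statvfs(path)
    blocks = usage_percent(st.f_blocks, st.f_bavail)
    # Some filesystems report 0 total inodes; usage_percent handles that case.
    inodes = usage_percent(st.f_files, st.f_favail)
    return blocks, inodes
```

Calling `disk_and_inode_usage("/var/log/hadoop-yarn/container")` on an affected node gives both percentages in one go. Both can look healthy while the directory is still reported as bad, which is the situation described here.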
Created ‎10-29-2017 04:06 AM
The problem was the limit on the number of subdirectories allowed under a single directory.
When I checked the container folder, I saw it held 32,000 directories, which is the limit.
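A rough way to check for this condition is to count the immediate subdirectories of the log dir and compare the count with the limit. The sketch below assumes the 32,000 figure quoted above (it depends on the filesystem, so treat it as a parameter). The names `count_subdirs`, `is_near_limit` and `report` are hypothetical helpers for this example, not part of YARN.

```python
import os

# Assumption: 32,000 is the per-directory limit quoted in this thread;
# the real value depends on the filesystem in use.
SUBDIR_LIMIT = 32000


def count_subdirs(path):
    """Count the immediate subdirectories of path (symlinks are not followed)."""
    with os.scandir(path) as entries:
        return sum(1 for entry in entries if entry.is_dir(follow_symlinks=False))


def is_near_limit(count, limit=SUBDIR_LIMIT, threshold=0.9):
    """Return True once count reaches threshold (a fraction) of limit."""
    return count >= limit * threshold


def report(path, limit=SUBDIR_LIMIT):
    """Build a one-line summary of how full the directory is."""
    count = count_subdirs(path)
    state = "NEAR LIMIT" if is_near_limit(count, limit) else "ok"
    return f"{path}: {count} subdirectories (limit {limit}): {state}"
```

For example, `print(report("/var/log/hadoop-yarn/container"))` run on the affected NodeManager host would show how close the directory is to the limit. The 90% threshold is an arbitrary choice so that a warning fires before the node turns UNHEALTHY.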
I'm now looking into why the retention is not deleting these files. I have the following configuration:
