Created 07-04-2018 09:01 AM
Hi all,
We have a Hadoop cluster, version 2.6.0.3, with YARN version 2.7.3.
We see that /var on the worker (DataNode) machines is full.
The root cause is a set of huge folders, such as blockmgr-b1ed0e9c-5700-4575-aa5e-182146f743d9,
under /var/hadoop/yarn/local/usercache/hdfs/appcache/application_1530106922052_0041.
Please advise how to avoid this issue. Why do these folders grow so large?
[root@worker01 application_1530106922052_0041]# pwd
/var/hadoop/yarn/local/usercache/hdfs/appcache/application_1530106922052_0041
[root@worker01 application_1530106922052_0041]# \ls -ltr
total 8
drwx--x--- 2 yarn hadoop 6 Jul 3 06:19 filecache
drwxr-xr-x 66 yarn hadoop 4096 Jul 3 06:24 blockmgr-b1ed0e9c-5700-4575-aa5e-182146f743d9
drwxr-xr-x 65 yarn hadoop 4096 Jul 4 08:02 blockmgr-c6530cea-1e98-419b-8653-3e9b467ac029
[root@worker01 application_1530106922052_0041]# du -sh *
33G blockmgr-b1ed0e9c-5700-4575-aa5e-182146f743d9
31G blockmgr-c6530cea-1e98-419b-8653-3e9b467ac029
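To see which application directories take the most space on a worker, the du check above could be scripted roughly as follows. This is only a sketch: the function names are made up for illustration, the appcache path is the one from this post, and it sums apparent file sizes, so the numbers can differ slightly from what du reports.

```python
import os

def dir_size_bytes(path):
    """Sum the sizes of files under path, skipping symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.lstat(full).st_size
            except OSError:
                # Files can disappear while an application is still running.
                pass
    return total

def largest_subdirs(parent, top=10):
    """Return (size_bytes, name) pairs for the largest immediate subdirectories."""
    sizes = []
    for entry in os.scandir(parent):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((dir_size_bytes(entry.path), entry.name))
    return sorted(sizes, reverse=True)[:top]

if __name__ == "__main__":
    # Path taken from the post; adjust to your own yarn.nodemanager.local-dirs.
    appcache = "/var/hadoop/yarn/local/usercache/hdfs/appcache"
    if os.path.isdir(appcache):
        for size, name in largest_subdirs(appcache):
            print(f"{size / 1024**3:8.1f} GiB  {name}")
```

Run it as root or as the yarn user on the worker, since the directories are not world-readable.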
Created 07-04-2018 09:06 AM
Please check the YARN configs:
Ambari dashboard --> YARN --> Configs --> Advanced --> Custom yarn-site --> Add/find Property
And check for the following properties
yarn.nodemanager.localizer.cache.target-size-mb: This sets the maximum disk space to be used for localized resources. Once the total size of the cache exceeds this value, the deletion service tries to remove files that are not used by any running container. At present there is no individual limit (quota) for the PRIVATE / APPLICATION / PUBLIC caches (YARN-882). The limit applies to all disks in total, not per disk.
yarn.nodemanager.localizer.cache.cleanup.interval-ms: After this interval, the resource localization service tries to delete unused resources if the total cache size exceeds the configured maximum. Unused resources are those not referenced by any running container. Every time a container requests a resource, the container is added to the resource's reference list, and it stays there until the container finishes, which prevents accidental deletion of the resource. As part of container cleanup (when the container finishes), the container is removed from the resource's reference list. Once the reference count drops to zero, the resource becomes a candidate for deletion. Resources are deleted on an LRU basis until the cache size drops below the target size.
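The cleanup behavior described above can be pictured with a small simulation. This is a sketch of the idea only, not the NodeManager code: the Resource class, its fields, and the cleanup function are hypothetical names made up for illustration.

```python
from dataclasses import dataclass

@dataclass
class Resource:
    name: str
    size_mb: int
    ref_count: int   # number of running containers still using the resource
    last_used: int   # larger value means more recently used

def cleanup(resources, target_size_mb):
    """Simulate one cleanup pass and return (kept, deleted)."""
    total = sum(r.size_mb for r in resources)
    # Only unreferenced resources are candidates, least recently used first.
    candidates = sorted(
        (r for r in resources if r.ref_count == 0),
        key=lambda r: r.last_used,
    )
    deleted = []
    for r in candidates:
        if total <= target_size_mb:
            break  # cache is back within the target, stop deleting
        deleted.append(r)
        total -= r.size_mb
    deleted_names = {r.name for r in deleted}
    kept = [r for r in resources if r.name not in deleted_names]
    return kept, deleted

cache = [
    Resource("a.jar", 3000, ref_count=0, last_used=1),
    Resource("b.jar", 2000, ref_count=1, last_used=2),  # still in use, never deleted
    Resource("c.jar", 1500, ref_count=0, last_used=3),
    Resource("d.jar", 1000, ref_count=0, last_used=4),
]
kept, deleted = cleanup(cache, target_size_mb=4096)
print("deleted:", [r.name for r in deleted])
print("kept:", [r.name for r in kept])
```

Note that a resource still referenced by a running container is never removed, so the cache can stay above the target while applications are running.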
For example, please set the values to something like the following:
yarn.nodemanager.localizer.cache.target-size-mb = 4096 (4 GB, or as desired)
yarn.nodemanager.localizer.cache.cleanup.interval-ms = 300000 (or as desired)
Reference: https://hortonworks.com/blog/resource-localization-in-yarn-deep-dive/
Created 07-04-2018 09:25 AM
You can add those properties from the Ambari UI as follows:
Ambari dashboard --> YARN --> Configs --> Advanced --> Custom yarn-site --> click "Add Property"
Then add the following two properties. The target size is given in MB, so 10240 means about 10 GB; for 5 GB it can be set to 5120.
yarn.nodemanager.localizer.cache.target-size-mb = 10240
yarn.nodemanager.localizer.cache.cleanup.interval-ms = 300000
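If you want to double-check the unit conversion, or see how the two properties would look as yarn-site.xml entries on a cluster that is not managed through Ambari, a minimal sketch follows. The helper names gb_to_mb and property_xml are made up for illustration, and the values are the examples from this post, not recommendations.

```python
from xml.sax.saxutils import escape

def gb_to_mb(gb):
    """Convert GB to the MB value expected by target-size-mb."""
    return int(gb * 1024)

def property_xml(name, value):
    """Render one Hadoop-style <property> element as text."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(str(value))}</value>\n"
        "</property>"
    )

settings = {
    # 10 GB expressed in MB
    "yarn.nodemanager.localizer.cache.target-size-mb": gb_to_mb(10),
    # 5 minutes expressed in milliseconds
    "yarn.nodemanager.localizer.cache.cleanup.interval-ms": 5 * 60 * 1000,
}

for name, value in settings.items():
    print(property_xml(name, value))
```

With Ambari you only enter the property name and value in the UI; Ambari writes the XML itself and asks you to restart the affected YARN components.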
Created 03-28-2022 03:02 AM
What do these blockmgr files contain?
Created 03-28-2022 03:15 AM
What is stored inside these blockmgr-* files? Is it related to the input files that Spark is reading?
Created 07-04-2018 09:11 AM
@Jay I do not have the properties yarn.nodemanager.localizer.cache.target-size-mb and yarn.nodemanager.localizer.cache.cleanup.interval-ms in YARN.
So please advise how to add them, and what values I need to set for each of them.
Note: /var is 100G on each worker machine.