yarn + usercache + folder became with huge size

Hi all,

We have a Hadoop cluster, version 2.6.0.3, with YARN version 2.7.3.

We see that /var on the worker (DataNode) machines is full.

The root cause is a set of huge folders, such as blockmgr-b1ed0e9c-5700-4575-aa5e-182146f743d9,

under /var/hadoop/yarn/local/usercache/hdfs/appcache/application_1530106922052_0041

Please advise how to avoid this issue. Why do these folders grow to such a huge size?

[root@worker01 application_1530106922052_0041]# pwd
/var/hadoop/yarn/local/usercache/hdfs/appcache/application_1530106922052_0041
[root@worker01 application_1530106922052_0041]# \ls -ltr
total 8
drwx--x---  2 yarn hadoop    6 Jul  3 06:19 filecache
drwxr-xr-x 66 yarn hadoop 4096 Jul  3 06:24 blockmgr-b1ed0e9c-5700-4575-aa5e-182146f743d9
drwxr-xr-x 65 yarn hadoop 4096 Jul  4 08:02 blockmgr-c6530cea-1e98-419b-8653-3e9b467ac029
[root@worker01 application_1530106922052_0041]# du -sh *
33G     blockmgr-b1ed0e9c-5700-4575-aa5e-182146f743d9
31G     blockmgr-c6530cea-1e98-419b-8653-3e9b467ac029
Michael-Bronson
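
To find which application directories take the most space without running du by hand on each worker, something like the following sketch could be used. It is only an illustration: the function names dir_size_bytes and largest_subdirs are made up here, and it reports apparent file sizes, so the numbers can differ slightly from what du -sh shows.

```python
import os


def dir_size_bytes(path):
    """Sum the apparent sizes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            try:
                if not os.path.islink(full):
                    total += os.path.getsize(full)
            except OSError:
                # The file may vanish while a running job cleans up; skip it.
                pass
    return total


def largest_subdirs(parent, top=10):
    """Return (name, size_bytes) for immediate subdirectories, largest first."""
    sizes = []
    with os.scandir(parent) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                sizes.append((entry.name, dir_size_bytes(entry.path)))
    sizes.sort(key=lambda pair: pair[1], reverse=True)
    return sizes[:top]
```

For example, largest_subdirs("/var/hadoop/yarn/local/usercache/hdfs/appcache") would list the application_* directories on that worker ordered by size.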
1 ACCEPTED SOLUTION

Master Mentor

@Michael Bronson

Please check the YARN configs:

Ambari dashboard --> YARN --> Configs --> Advanced --> Custom yarn-site --> Add/find Property

Then check the following properties:

yarn.nodemanager.localizer.cache.target-size-mb: This sets the maximum disk space to be used for localizing resources. At present there is no individual limit (quota) for the PRIVATE / APPLICATION / PUBLIC caches (see YARN-882). Once the total disk size of the cache exceeds this value, the Deletion service will try to remove files that are not used by any running container. The limit applies to all disks as a total, not to each disk separately.

yarn.nodemanager.localizer.cache.cleanup.interval-ms:
After this interval, the resource localization service will try to delete unused resources if the total cache size exceeds the configured maximum size. Unused resources are those that are not referenced by any running container. Every time a container requests a resource, the container is added to the resource's reference list. It remains there until the container finishes, which avoids accidental deletion of the resource. As part of container resource cleanup (when the container finishes), the container is removed from the resource's reference list. Once the reference count drops to zero, the resource becomes a candidate for deletion. Resources are deleted on an LRU basis until the current cache size drops below the target size.
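
The cleanup behaviour described above can be pictured with a small model. This is only a sketch of the idea (reference lists plus LRU eviction down to a target size), not the NodeManager's actual code; the CachedResource class, the cleanup function and the logical timestamps are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CachedResource:
    name: str
    size_mb: int
    last_used: int  # logical timestamp; a larger value means more recently used
    refs: set = field(default_factory=set)  # IDs of containers still using it


def cleanup(resources, target_size_mb):
    """Evict unreferenced resources, least recently used first, until the
    total size no longer exceeds target_size_mb. Returns the evicted names."""
    total = sum(r.size_mb for r in resources.values())
    evicted = []
    # Only resources with an empty reference list may be deleted.
    candidates = sorted(
        (r for r in resources.values() if not r.refs),
        key=lambda r: r.last_used,
    )
    for res in candidates:
        if total <= target_size_mb:
            break
        del resources[res.name]
        total -= res.size_mb
        evicted.append(res.name)
    return evicted
```

Note that in this model a resource that is still referenced by a running container is never evicted, so the cache can stay above the target size while containers are running.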

For example, set the values to something like the following:

yarn.nodemanager.localizer.cache.target-size-mb = 4096  (4 GB, or desired)
yarn.nodemanager.localizer.cache.cleanup.interval-ms = 300000  (5 minutes, or desired)

Reference: https://hortonworks.com/blog/resource-localization-in-yarn-deep-dive/

5 REPLIES 5

Master Mentor

@Michael Bronson

You can add those properties from the Ambari UI as follows:

Ambari dashboard --> YARN --> Configs --> Advanced --> Custom yarn-site --> click "Add Property"

Then add the following two properties. The target-size value is in MB: 10240 means around 10 GB, and for 5 GB it can be set to 5120.

yarn.nodemanager.localizer.cache.target-size-mb = 10240
yarn.nodemanager.localizer.cache.cleanup.interval-ms = 300000
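
Since the first property takes megabytes and the second takes milliseconds, a small helper can make the unit conversion explicit. The sketch below is illustrative only: the helper names are invented, and the XML it prints is simply the standard Hadoop configuration file layout, shown for reference. On an Ambari-managed cluster the values would still be entered through the UI as described above.

```python
import xml.etree.ElementTree as ET


def gb_to_mb(gb):
    """Convert gigabytes to megabytes (1 GB = 1024 MB)."""
    return int(gb * 1024)


def localizer_cache_properties(target_size_gb, cleanup_interval_minutes):
    """Build the two property values in the units YARN expects."""
    return {
        "yarn.nodemanager.localizer.cache.target-size-mb": gb_to_mb(target_size_gb),
        "yarn.nodemanager.localizer.cache.cleanup.interval-ms": int(
            cleanup_interval_minutes * 60 * 1000
        ),
    }


def to_yarn_site_xml(props):
    """Render the properties in the <configuration>/<property> layout."""
    root = ET.Element("configuration")
    for name, value in props.items():
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(value)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # 10 GB target size and a 5 minute cleanup interval, as in the reply above.
    print(to_yarn_site_xml(localizer_cache_properties(10, 5)))
```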

New Contributor

What do these blockmgr files contain?

New Contributor

What is stored inside these blockmgr-* files? Do they have any relation to the input files Spark is reading?

@Jay I do not have the properties yarn.nodemanager.localizer.cache.target-size-mb and yarn.nodemanager.localizer.cache.cleanup.interval-ms in my YARN configuration,

so please advise how to add them, and what values should I set for each of them?

Note: /var is 100G on each worker machine.

Michael-Bronson