YARN usercache folder has grown to a huge size
- Labels: Apache Hadoop, Apache YARN
Created ‎07-04-2018 09:01 AM
Hi all,
We have a Hadoop cluster, version 2.6.0.3, with YARN version 2.7.3.
We see that /var on the worker (DataNode) machines is full.
The root cause is huge folders such as blockmgr-b1ed0e9c-5700-4575-aa5e-182146f743d9
under /var/hadoop/yarn/local/usercache/hdfs/appcache/application_1530106922052_0041
Please advise how to avoid this issue. Why are these folders so large?
[root@worker01 application_1530106922052_0041]# pwd
/var/hadoop/yarn/local/usercache/hdfs/appcache/application_1530106922052_0041
[root@worker01 application_1530106922052_0041]# \ls -ltr
total 8
drwx--x--- 2 yarn hadoop 6 Jul 3 06:19 filecache
drwxr-xr-x 66 yarn hadoop 4096 Jul 3 06:24 blockmgr-b1ed0e9c-5700-4575-aa5e-182146f743d9
drwxr-xr-x 65 yarn hadoop 4096 Jul 4 08:02 blockmgr-c6530cea-1e98-419b-8653-3e9b467ac029
[root@worker01 application_1530106922052_0041]# du -sh *
33G blockmgr-b1ed0e9c-5700-4575-aa5e-182146f743d9
31G blockmgr-c6530cea-1e98-419b-8653-3e9b467ac029
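To find which folders are filling the disk without running du by hand in every application directory, something like the following could be used. This is only a minimal sketch: the function names dir_size_bytes and largest_subdirs are made up for illustration, and the sizes are sums of file lengths, so they can differ slightly from what du reports.

```python
import os

def dir_size_bytes(path):
    """Sum the sizes of all regular files under path (symlinks are skipped)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            full = os.path.join(root, name)
            if os.path.islink(full):
                continue
            try:
                total += os.path.getsize(full)
            except OSError:
                # The file may disappear while a running application cleans up.
                pass
    return total

def largest_subdirs(parent, top=10):
    """Return up to `top` (size_in_bytes, name) pairs for the direct
    subdirectories of parent, largest first."""
    sizes = []
    for entry in os.scandir(parent):
        if entry.is_dir(follow_symlinks=False):
            sizes.append((dir_size_bytes(entry.path), entry.name))
    return sorted(sizes, reverse=True)[:top]
```

For example, calling largest_subdirs on /var/hadoop/yarn/local/usercache/hdfs/appcache/application_1530106922052_0041 would list the blockmgr-* folders shown above, largest first.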
Created ‎07-04-2018 09:06 AM
Please check the YARN configs:
Ambari dashboard --> YARN --> Configs --> Advanced --> Custom yarn-site --> Add/find Property
and check for the following properties:
yarn.nodemanager.localizer.cache.target-size-mb: This sets the maximum disk space, in MB, to be used for localized resources. Once the total size of the cache exceeds this value, the deletion service tries to remove files that are not used by any running container. At present there is no individual limit (quota) for the PRIVATE / APPLICATION / PUBLIC caches (see YARN-882). The limit applies to all disks as a total, not per disk.
yarn.nodemanager.localizer.cache.cleanup.interval-ms: After each interval, the resource localization service tries to delete unused resources if the total cache size exceeds the configured target size. Unused resources are those not referenced by any running container. Every time a container requests a resource, the container is added to the resource's reference list. It stays there until the container finishes, which prevents accidental deletion of the resource. As part of container resource cleanup (when the container finishes), the container is removed from the resource's reference list. Once the reference count drops to zero, the resource becomes a candidate for deletion. Resources are deleted on an LRU basis until the current cache size drops below the target size.
For example, please set the values to something like the following:
yarn.nodemanager.localizer.cache.target-size-mb = 4096 (4 GB, or as desired)
yarn.nodemanager.localizer.cache.cleanup.interval-ms = 300000 (5 minutes, or as desired)
Reference: https://hortonworks.com/blog/resource-localization-in-yarn-deep-dive/
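The cleanup behaviour described in this answer (only resources with no referencing container are eligible, and they are removed least recently used first until the cache is back under the target) can be illustrated roughly as below. This is a simplified sketch, not the NodeManager's actual code: CachedResource and select_for_deletion are hypothetical names, and sizes are plain integers in MB.

```python
from dataclasses import dataclass

@dataclass
class CachedResource:
    name: str
    size_mb: int
    last_used: float  # timestamp of the most recent use
    ref_count: int    # number of running containers still referencing it

def select_for_deletion(resources, target_size_mb):
    """Pick unreferenced resources, least recently used first, until the
    remaining cache size is no larger than target_size_mb."""
    total = sum(r.size_mb for r in resources)
    to_delete = []
    # Only resources with a zero reference count may be removed.
    candidates = sorted(
        (r for r in resources if r.ref_count == 0),
        key=lambda r: r.last_used,
    )
    for r in candidates:
        if total <= target_size_mb:
            break
        to_delete.append(r.name)
        total -= r.size_mb
    return to_delete
```

Note that a resource still referenced by a running container is never selected, so the cache can stay above the target while applications are running.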
Created ‎07-04-2018 09:25 AM
You can add those properties from the Ambari UI as follows:
Ambari dashboard --> YARN --> Configs --> Advanced --> Custom yarn-site --> click "Add Property"
Then add the following two properties (the target size is in MB, so 10240 means about 10 GB; for 5 GB it can be set to 5120):
yarn.nodemanager.localizer.cache.target-size-mb = 10240
yarn.nodemanager.localizer.cache.cleanup.interval-ms = 300000
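Since one property is in MB and the other in milliseconds, the unit conversion is easy to get wrong. A small sketch of the arithmetic follows; the helper names gb_to_mb and cache_properties are made up for illustration, and only the two property names come from this thread.

```python
def gb_to_mb(gb):
    """Convert GB to MB using 1 GB = 1024 MB."""
    return int(gb * 1024)

def cache_properties(target_gb, cleanup_minutes):
    """Build the two yarn-site values from human-friendly units."""
    return {
        "yarn.nodemanager.localizer.cache.target-size-mb":
            str(gb_to_mb(target_gb)),
        "yarn.nodemanager.localizer.cache.cleanup.interval-ms":
            str(int(cleanup_minutes * 60 * 1000)),
    }
```

With a 10 GB target and a 5 minute interval this gives 10240 and 300000, the same values as above.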
Created ‎03-28-2022 03:02 AM
What do these blockmgr files contain?
Created ‎03-28-2022 03:15 AM
What is stored inside these blockmgr-* files? Do they have any relation to the input files Spark is reading?
Created ‎07-04-2018 09:11 AM
@Jay I do not have these properties in YARN: yarn.nodemanager.localizer.cache.target-size-mb and yarn.nodemanager.localizer.cache.cleanup.interval-ms.
Please advise how to add them, and what values I should set for each of them.
Note: /var is 100G on each worker machine.
