Created on 12-23-2016 06:14 PM
SYMPTOM: The YARN timeline logs are growing very fast and the disk is now 100% utilized. The settings currently in place for the ATS (Application Timeline Server) are listed below.
Configs:
<property>
  <name>yarn.timeline-service.ttl-enable</name>
  <value>true</value>
</property>
<property>
  <name>yarn.timeline-service.ttl-ms</name>
  <value>1339200000</value>
</property>
<property>
  <name>yarn.timeline-service.leveldb-timeline-store.ttl-interval-ms</name>
  <value>150000</value>
</property>
ROOT CAUSE: The "yarn.timeline-service.leveldb-timeline-store.ttl-interval-ms" setting does not change the semantics of the ATS purging process. It only affects how a LevelDB-based storage implementation carries out purging: it sets the time interval between two purge runs in a LevelDB-based ATS store (such as the leveldb store and the rolling leveldb store). How long data is retained before it becomes eligible for purging is controlled by "yarn.timeline-service.ttl-ms".
In this case, the customer set the TTL to 1339200000 ms, which is 1339200 seconds, 372 hours, or 15.5 days. On a normal cluster with a limited disk space budget, this may cause problems (at a growth rate of 13 MB per hour). Reducing this value may help alleviate the problem.
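The arithmetic above can be checked with a short Python sketch. The function names are hypothetical, and the size estimate assumes that the 13 MB per hour figure is a steady write rate into the timeline store and that everything older than the TTL is purged; treat the result as a rough upper-level estimate, not a measurement.

```python
MS_PER_HOUR = 60 * 60 * 1000
HOURS_PER_DAY = 24


def ttl_ms_to_hours(ttl_ms: int) -> float:
    """Convert a yarn.timeline-service.ttl-ms value to hours."""
    return ttl_ms / MS_PER_HOUR


def ttl_ms_to_days(ttl_ms: int) -> float:
    """Convert a yarn.timeline-service.ttl-ms value to days."""
    return ttl_ms_to_hours(ttl_ms) / HOURS_PER_DAY


def retained_mb(ttl_ms: int, growth_mb_per_hour: float) -> float:
    """Rough steady-state store size in MB.

    Assumption: data is written at a constant rate and everything
    older than the TTL has been purged, so about one TTL's worth
    of writes stays on disk.
    """
    return ttl_ms_to_hours(ttl_ms) * growth_mb_per_hour


if __name__ == "__main__":
    for ttl in (1339200000, 669600000):
        print(
            f"ttl-ms={ttl}: {ttl_ms_to_hours(ttl)} h, "
            f"{ttl_ms_to_days(ttl)} days, "
            f"~{retained_mb(ttl, 13)} MB retained at 13 MB/h"
        )
```

Under these assumptions, halving the TTL halves the amount of timeline data kept on disk once the purge has caught up.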
RESOLUTION: In this case, the issue was resolved by changing the value of the property "yarn.timeline-service.ttl-ms" in the Application Timeline configuration from 1339200000 (15.5 days) to 669600000 (186 hours, or 7.75 days), which halves the retention period.
<property>
  <name>yarn.timeline-service.ttl-ms</name>
  <value>669600000</value>
</property>
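To pick a different retention period, the value for "yarn.timeline-service.ttl-ms" can be derived from a target number of days. The following is a minimal Python sketch; `days_to_ttl_ms` and `render_property` are hypothetical helper names used only for illustration, and the output is meant to be pasted into the configuration by hand.

```python
from xml.sax.saxutils import escape

MS_PER_DAY = 24 * 60 * 60 * 1000


def days_to_ttl_ms(days: float) -> int:
    """Convert a retention period in days to milliseconds."""
    return int(round(days * MS_PER_DAY))


def render_property(name: str, value) -> str:
    """Render one Hadoop-style <property> element as text."""
    return (
        "<property>\n"
        f"  <name>{escape(name)}</name>\n"
        f"  <value>{escape(str(value))}</value>\n"
        "</property>"
    )


if __name__ == "__main__":
    # 7.75 days reproduces the value used in the resolution above.
    print(render_property("yarn.timeline-service.ttl-ms", days_to_ttl_ms(7.75)))
    # An exact 7-day retention would be 604800000 ms instead.
    print(render_property("yarn.timeline-service.ttl-ms", days_to_ttl_ms(7)))
```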