
App timeline server not starting

Explorer

Hello guys,

The App Timeline Server is not starting up. Whenever I try to bring it up, it stops within a minute.

Here are the logs from /var/log/hadoop-yarn/yarn/yarn-yarn-timelineserver-node2.log

2018-05-21 15:57:56,292 INFO timeline.RollingLevelDB (RollingLevelDB.java:initRollingLevelDB(258)) - Initializing rolling leveldb instance :file:/hadoop/yarn/timeline/leveldb-timeline-store/indexes-ldb.2017-08-22-13 for start time: 1503406800000
2018-05-21 15:57:56,409 INFO timeline.RollingLevelDB (RollingLevelDB.java:initRollingLevelDB(266)) - Added rolling leveldb instance 2017-08-22-13 to indexes-ldb
2018-05-21 15:57:56,717 INFO timeline.RollingLevelDBTimelineStore (RollingLevelDBTimelineStore.java:checkVersion(1581)) - Loaded timeline store version info 1.0
2018-05-21 15:57:56,720 INFO timeline.EntityGroupFSTimelineStore (EntityGroupFSTimelineStore.java:serviceInit(157)) - Cleaner set to delete logs older than 604800 seconds
2018-05-21 15:57:56,720 INFO timeline.EntityGroupFSTimelineStore (EntityGroupFSTimelineStore.java:serviceInit(164)) - Unknown apps will be treated as complete after 86400 seconds
2018-05-21 15:57:56,720 INFO timeline.EntityGroupFSTimelineStore (EntityGroupFSTimelineStore.java:serviceInit(170)) - Application cache size is 10
2018-05-21 15:57:56,755 FATAL applicationhistoryservice.ApplicationHistoryServer (ApplicationHistoryServer.java:launchAppHistoryServer(171)) - Error starting ApplicationHistoryServer
java.lang.InternalError: java.io.FileNotFoundException: /usr/jdk64/jdk1.8.0_60/jre/lib/ext/sunec.jar (Too many open files)
    at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:1003)
    at sun.misc.URLClassPath.getResource(URLClassPath.java:212)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:365)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:362)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:361)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:411)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at sun.security.jca.ProviderConfig$2.run(ProviderConfig.java:215)
    at sun.security.jca.ProviderConfig$2.run(ProviderConfig.java:206)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.security.jca.ProviderConfig.doLoadProvider(ProviderConfig.java:206)
    at sun.security.jca.ProviderConfig.getProvider(ProviderConfig.java:187)
    at sun.security.jca.ProviderList.getProvider(ProviderList.java:233)
    at sun.security.jca.ProviderList.getService(ProviderList.java:331)
    at sun.security.jca.GetInstance.getInstance(GetInstance.java:157)
    at javax.net.ssl.KeyManagerFactory.getInstance(KeyManagerFactory.java:137)
    at org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory.init(FileBasedKeyStoresFactory.java:179)
    at org.apache.hadoop.security.ssl.SSLFactory.init(SSLFactory.java:131)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.newSslConnConfigurator(TimelineClientImpl.java:656)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.newConnConfigurator(TimelineClientImpl.java:631)
    at org.apache.hadoop.yarn.client.api.impl.TimelineClientImpl.serviceInit(TimelineClientImpl.java:330)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.client.api.impl.YarnClientImpl.serviceInit(YarnClientImpl.java:170)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.createAndInitYarnClient(EntityGroupFSTimelineStore.java:454)
    at org.apache.hadoop.yarn.server.timeline.EntityGroupFSTimelineStore.serviceInit(EntityGroupFSTimelineStore.java:173)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.service.CompositeService.serviceInit(CompositeService.java:107)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.serviceInit(ApplicationHistoryServer.java:104)
    at org.apache.hadoop.service.AbstractService.init(AbstractService.java:163)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.launchAppHistoryServer(ApplicationHistoryServer.java:168)
    at org.apache.hadoop.yarn.server.applicationhistoryservice.ApplicationHistoryServer.main(ApplicationHistoryServer.java:178)
Caused by: java.io.FileNotFoundException: /usr/jdk64/jdk1.8.0_60/jre/lib/ext/sunec.jar (Too many open files)
    at java.util.zip.ZipFile.open(Native Method)
    at java.util.zip.ZipFile.<init>(ZipFile.java:219)
    at java.util.zip.ZipFile.<init>(ZipFile.java:149)
    at java.util.jar.JarFile.<init>(JarFile.java:166)
    at java.util.jar.JarFile.<init>(JarFile.java:103)
    at sun.misc.URLClassPath$JarLoader.getJarFile(URLClassPath.java:893)
    at sun.misc.URLClassPath$JarLoader.access$700(URLClassPath.java:756)
    at sun.misc.URLClassPath$JarLoader$1.run(URLClassPath.java:838)
    at sun.misc.URLClassPath$JarLoader$1.run(URLClassPath.java:831)
    at java.security.AccessController.doPrivileged(Native Method)
    at sun.misc.URLClassPath$JarLoader.ensureOpen(URLClassPath.java:830)
    at sun.misc.URLClassPath$JarLoader.getResource(URLClassPath.java:1001)
    ... 34 more
2018-05-21 15:57:56,759 INFO util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status -1
2018-05-21 15:57:56,761 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(211)) - Stopping ApplicationHistoryServer metrics system...
2018-05-21 15:57:56,762 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:stop(217)) - ApplicationHistoryServer metrics system stopped.
2018-05-21 15:57:56,762 INFO impl.MetricsSystemImpl (MetricsSystemImpl.java:shutdown(605)) - ApplicationHistoryServer metrics system shutdown complete.
2018-05-21 15:57:56,762 INFO timeline.EntityGroupFSTimelineStore (EntityGroupFSTimelineStore.java:serviceStop(297)) - Stopping EntityGroupFSTimelineStore
2018-05-21 15:57:56,805 INFO applicationhistoryservice.ApplicationHistoryServer (LogAdapter.java:info(45)) - SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down ApplicationHistoryServer at node2/IP

As per the exception, I tried increasing the ulimit for the user. Even after the value was tripled, I'm getting the same error.

Any help will be appreciated.


Super Mentor

@Anand k

You are getting the following "Too many open files" error:

(ApplicationHistoryServer.java:launchAppHistoryServer(171)) - Error starting ApplicationHistoryServer
java.lang.InternalError: java.io.FileNotFoundException: /usr/jdk64/jdk1.8.0_60/jre/lib/ext/sunec.jar (Too many open files)

This indicates that the file descriptor limit may not be set high enough. Please check the file "/etc/security/limits.conf" to see whether the system-wide ulimits are set properly.

https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.5/bk_security/content/kerb-config-limits.html

Example:

# ulimit -a
# ulimit -n 32768
# ulimit -n

Also, please check the values set in this file:
# cat /etc/security/limits.d/yarn.conf
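
Keep in mind that "ulimit -n" reports the limit of the shell it is run in, while the entries in /etc/security/limits.d/ are applied by PAM when a session for that user is opened, so the two can differ from what a running daemon actually got. Below is a minimal sketch for reading the limit a process really has, assuming a Linux host with /proc. The helper name max_open_files is only illustrative and not part of Hadoop or Ambari.

```shell
# Print the soft "Max open files" limit of a running process (Linux /proc only).
# max_open_files is a hypothetical helper name, not a Hadoop or Ambari tool.
max_open_files() {
    local pid="$1"
    # Each row of /proc/<pid>/limits is: limit name, soft limit, hard limit, units.
    # For the "Max open files" row the soft limit is the 4th whitespace field.
    awk '/^Max open files/ {print $4}' "/proc/${pid}/limits"
}

# Example: show the limit of the current shell.
max_open_files "$$"
```

Passing the PID of the timeline server instead of "$$" shows the limit that process was started with, which is the number that matters here.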


Also, please check the value of the following property:

yarn.timeline-service.leveldb-timeline-store.max-open-files
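
If it is easier than going through the UI, the value can be read straight from the config file. This is only a sketch: the path /etc/hadoop/conf/yarn-site.xml is an assumed location, get_hadoop_property is a made-up helper name, and it assumes the name and value tags sit on adjacent lines. If nothing is printed, the property is probably not set in that file.

```shell
# Print the <value> of a property from a Hadoop-style XML config file.
# get_hadoop_property is a hypothetical helper; it assumes <name> and <value>
# are on separate, adjacent lines.
get_hadoop_property() {
    local file="$1" name="$2"
    # -F treats the property name literally, -A1 also prints the following line.
    grep -F -A1 "<name>${name}</name>" "$file" \
        | sed -n 's:.*<value>\(.*\)</value>.*:\1:p'
}

# Assumed config path; adjust it for your installation.
conf=/etc/hadoop/conf/yarn-site.xml
if [ -r "$conf" ]; then
    get_hadoop_property "$conf" \
        yarn.timeline-service.leveldb-timeline-store.max-open-files
fi
```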


Explorer

Hello @Jay Kumar SenSharma, any updates on this issue?

Super Mentor

@Anand k

If you are still facing the same "Too many open files" issue, then the open file descriptor limit might not be set properly.

Please share the output that we requested in our previous update.

# ulimit -a
# ulimit -n 32768
# ulimit -n

Also, please check the values set in this file:
# cat /etc/security/limits.d/yarn.conf


Also, please share the output of the following commands:

# lsof -p $APP_TIMELINE_PID  | wc -l
# lsof -p $APP_TIMELINE_PID  
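
The lsof commands above need $APP_TIMELINE_PID to hold the PID of the timeline server. One possible way to fill it in is sketched below. It assumes a Linux host, looks the process up by the main class name seen in the stack trace (ApplicationHistoryServer), and uses a made-up helper name, count_open_fds. Since the server stops within a minute, this has to be run right after starting it, as root or as the user that owns the process.

```shell
# Count the open file descriptors of a process via /proc (Linux only).
# count_open_fds is a hypothetical helper name.
count_open_fds() {
    local pid="$1"
    ls "/proc/${pid}/fd" | wc -l
}

# Look up the PID by the main class seen in the stack trace.
# pgrep -f matches against the full command line; head keeps the first match.
APP_TIMELINE_PID="$(pgrep -f ApplicationHistoryServer | head -n 1 || true)"

if [ -n "$APP_TIMELINE_PID" ]; then
    echo "PID ${APP_TIMELINE_PID}: $(count_open_fds "$APP_TIMELINE_PID") open descriptors"
else
    echo "ApplicationHistoryServer is not running"
fi
```

If the count climbs toward the configured limit before the crash, the limit is really being exhausted. If it stays low, the process was most likely started with a smaller limit than the one configured.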


Explorer

@Jay Kumar SenSharma

# ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63522
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 32768
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 63522
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

# ulimit -n
32768

# cat /etc/security/limits.d/yarn.conf
yarn   - nofile 65536
yarn   - nproc  65536

# lsof -p $APP_TIMELINE_PID 
lsof: no process ID specified

# lsof -i TCP:10200
(no output)

Explorer

Hi @Jay Kumar SenSharma,

I did try setting higher values for the open file limit in the mentioned files. Still, the server is crashing.