A few days ago I accidentally deleted some of the contents of the /var/lib/cloudera-service-monitor/ts directory, where the monitor data is stored. Since then I have not been able to restart Service Monitor because it throws various exceptions. This is the latest one:
Failed to start Firehose
java.lang.RuntimeException: com.cloudera.cmon.tstore.leveldb.LDBPartitionManager$LDBPartitionException: Unable to open DB in directory /var/lib/cloudera-service-monitor/ts/stream/partitions/stream_2015-07-10T07:22:29.111Z for partition LDBPartitionMetadataWrapper{tableName=stream, partitionName=stream_2015-07-10T07:22:29.111Z, startTime=2015-07-10T07:22:29.111Z, endTime=null, version=2, state=CLOSED}
at com.cloudera.cmon.tstore.leveldb.LDBPartitionManager.getPartition(LDBPartitionManager.java:722)
at com.cloudera.cmon.tstore.leveldb.LDBPartitionUtils.forPartition(LDBPartitionUtils.java:70)
at com.cloudera.cmon.tstore.leveldb.LDBPartitionUtils.writeForPartition(LDBPartitionUtils.java:45)
at com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesStreamTable.write(LDBTimeSeriesStreamTable.java:118)
at com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesStreamTable.write(LDBTimeSeriesStreamTable.java:107)
at com.cloudera.cmon.tstore.leveldb.LDBTimeSeriesStore.write(LDBTimeSeriesStore.java:236)
at com.cloudera.cmon.tstore.AggregatingTimeSeriesStore.write(AggregatingTimeSeriesStore.java:219)
at com.cloudera.cmon.kaiser.TimeSeriesHelper.insertInternalMetrics(TimeSeriesHelper.java:194)
at com.cloudera.cmon.firehose.Firehose.insertStartupMetrics(Firehose.java:518)
at com.cloudera.cmon.firehose.Firehose.<init>(Firehose.java:310)
at com.cloudera.cmon.firehose.Main.main(Main.java:527)
Caused by: com.cloudera.cmon.tstore.leveldb.LDBPartitionManager$LDBPartitionException: Unable to open DB in directory /var/lib/cloudera-service-monitor/ts/stream/partitions/stream_2015-07-10T07:22:29.111Z for partition LDBPartitionMetadataWrapper{tableName=stream, partitionName=stream_2015-07-10T07:22:29.111Z, startTime=2015-07-10T07:22:29.111Z, endTime=null, version=2, state=CLOSED}
at com.cloudera.cmon.tstore.leveldb.LDBUtils.openOrCreatePartitionDB(LDBUtils.java:195)
at com.cloudera.cmon.tstore.leveldb.LDBPartitionManager.getOrOpenInternal(LDBPartitionManager.java:616)
at com.cloudera.cmon.tstore.leveldb.LDBPartitionManager.openOrCreatePartitionLDB(LDBPartitionManager.java:557)
at com.cloudera.cmon.tstore.leveldb.LDBPartitionManager.getPartition(LDBPartitionManager.java:451)
at com.cloudera.cmon.tstore.leveldb.LDBPartitionManager.getPartition(LDBPartitionManager.java:713)
... 10 more
Caused by: org.fusesource.leveldbjni.internal.NativeDB$DBException: Invalid argument: /var/lib/cloudera-service-monitor/ts/stream/partitions/stream_2015-07-10T07:22:29.111Z: does not exist (create_if_missing is false)
at org.fusesource.leveldbjni.internal.NativeDB.checkStatus(NativeDB.java:194)
at org.fusesource.leveldbjni.internal.NativeDB.open(NativeDB.java:212)
at org.fusesource.leveldbjni.JniDBFactory.open(JniDBFactory.java:168)
at com.cloudera.cmon.tstore.leveldb.LDBUtils.openOrCreatePartitionDB(LDBUtils.java:185)
... 14 more
When I check the filesystem, the directory exists, but I am not able to solve the problem.
Can anyone help me?
Thanks in advance.
Created 07-16-2015 09:51 AM
This could be either a permissions issue under /var/lib/cloudera-service-monitor or corrupted LevelDB data.
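To rule out the permissions case first, a read-only check like the following may help. It is only a sketch: the function name list_foreign_owned is made up for illustration, and the account name in the usage comment (cloudera-scm) is an assumption, so confirm which user actually runs Service Monitor on your host.

```shell
# List entries under a directory that are NOT owned by the expected user.
# Read-only: nothing is changed. Both arguments come from the caller.
list_foreign_owned() {
  dir="$1"
  owner="$2"
  if [ ! -d "$dir" ]; then
    echo "not a directory: $dir" >&2
    return 1
  fi
  # find prints every file or directory whose owner differs from $owner
  find "$dir" ! -user "$owner" -print
}

# Example (the user name is an assumption, adjust it to your installation):
# list_foreign_owned /var/lib/cloudera-service-monitor cloudera-scm
```

Empty output means everything under the directory belongs to the expected user, which would point towards corrupted or missing LevelDB data instead.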
As a workaround, if you don't need to look back at past service events and just want to start Service Monitor (SMON), you can re-initialise the SMON LevelDB location:
1. Stop Service Monitor
2. [bash]$ mv /var/lib/cloudera-service-monitor /var/lib/cloudera-service-monitor.moved
3. Start SMON; this will initialise your Service Monitor LevelDB/ts data.
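Step 2 can be wrapped in a small guard so that a leftover .moved directory from an earlier attempt is not silently reused (a plain mv onto an existing directory would nest the data inside it). This is a minimal sketch; move_aside is a hypothetical helper name, and Service Monitor still has to be stopped and started separately as in steps 1 and 3.

```shell
# Move a directory aside to "<dir>.moved", refusing to clobber an earlier copy.
move_aside() {
  src="${1%/}"            # drop a trailing slash, if any
  dest="${src}.moved"
  if [ ! -d "$src" ]; then
    echo "not a directory: $src" >&2
    return 1
  fi
  if [ -e "$dest" ]; then
    echo "refusing to overwrite existing $dest" >&2
    return 1
  fi
  mv "$src" "$dest" && echo "moved to $dest"
}

# Example, with Service Monitor stopped:
# move_aside /var/lib/cloudera-service-monitor
```

Keeping the moved copy around until SMON has started cleanly leaves a way back if the old data is needed after all.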
Let me know whether this helps.
Michalis
Created 07-25-2015 05:45 AM
Yeah! It works, thank you very much.
Created 04-21-2016 11:57 AM
Hi, just wanted to add that I had a similar problem with the Service Monitor, and after moving the old directory it started. The only significant thing I have to add is that I did not see any "error" labels in the startup error log file.
Thanks!!
Created on 12-01-2017 10:39 AM - edited 12-01-2017 10:53 AM
The procedure is the same for Service Monitor as for Host Monitor; just substitute the Service Monitor directories in the paths below.
You can salvage the contents of the Host Monitor by using the LDBStoreTool Java class to repair the corrupted LDB:
java -cp "/usr/share/cmf/lib/*" com.cloudera.cmon.tstore.leveldb.tool.LDBStoreTool repair --directory /var/lib/cloudera-host-monitor/subject_record/subject_ts/partitions/subject_ts_2017-10-30T18:03:04.415Z
[ main] log INFO Logging to org.slf4j.impl.Log4jLoggerAdapter(org.mortbay.log) via org.mortbay.log.Slf4jLog
[ main] CMONConfiguration INFO Config: jar:file:/usr/share/cmf/common_jars/firehose-5.12.1.jar!/cmon.conf
[ main] ConfigUtil WARN Could not find configuration file cmon-cm-auth.conf
[ main] LDBResourceManager INFO Max file descriptors: 4096
[ main] LDBResourceManager INFO Setting maximum open fds to: 2048
Running repair command
Success
If the LDBStoreTool Java class is unable to repair the corrupt LDB, you will have to purge the /var/lib/cloudera-host-monitor directory, similar to the steps noted above by Michalis.
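If several partitions are affected, the repair command above can be generated once per partition directory. The sketch below only prints the commands so they can be reviewed before anything runs; print_repair_commands is a hypothetical helper, and it assumes the same classpath (/usr/share/cmf/lib/*) as the command shown above, which may differ on your installation.

```shell
# Print one LDBStoreTool repair command per subdirectory of a partitions directory.
# Dry run only: nothing is executed or modified.
print_repair_commands() {
  base="$1"
  if [ ! -d "$base" ]; then
    echo "not a directory: $base" >&2
    return 1
  fi
  for p in "$base"/*/; do
    [ -d "$p" ] || continue   # skips the unexpanded pattern when there are no subdirectories
    printf 'java -cp "/usr/share/cmf/lib/*" com.cloudera.cmon.tstore.leveldb.tool.LDBStoreTool repair --directory "%s"\n' "${p%/}"
  done
}

# Example, using the partitions path from the stack trace earlier in this thread:
# print_repair_commands /var/lib/cloudera-service-monitor/ts/stream/partitions
```

Run the printed commands with the monitor stopped, and keep a copy of the directory beforehand in case a repair makes things worse.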