Created 08-03-2018 05:40 AM
2018-08-03 10:49:47,087 INFO [RS_LOG_REPLAY_OPS-CHMCISPRBDDN01:16020-1] coordination.ZkSplitLogWorkerCoordination: successfully transitioned task /hbase-unsecure/splitWAL/WALs%2Fchmcisprbddn08.chm.intra%2C16020%2C1503542190128-splitting%2Fchmcisprbddn08.chm.intra%252C16020%252C1503542190128.default.1503542196998 to final state ERR chmcisprbddn01.chm.intra,16020,1533218035807
2018-08-03 10:49:47,088 INFO [RS_LOG_REPLAY_OPS-CHMCISPRBDDN01:16020-1] handler.WALSplitterHandler: worker chmcisprbddn01.chm.intra,16020,1533218035807 done with task org.apache.hadoop.hbase.coordination.ZkSplitLogWorkerCoordination$ZkSplitTaskDetails@11ba74a2 in 39ms
2018-08-03 10:49:47,663 INFO [SplitLogWorker-CHMCISPRBDDN01:16020] coordination.ZkSplitLogWorkerCoordination: worker chmcisprbddn01.chm.intra,16020,1533218035807 acquired task /hbase-unsecure/splitWAL/WALs%2Fchmcisprbddn08.chm.intra%2C16020%2C1503542010122-splitting%2Fchmcisprbddn08.chm.intra%252C16020%252C1503542010122.default.1503542017092
2018-08-03 10:49:47,692 INFO [RS_LOG_REPLAY_OPS-CHMCISPRBDDN01:16020-0] wal.WALSplitter: Splitting wal: hdfs://prodcluster/apps/hbase/data/WALs/chmcisprbddn08.chm.intra,16020,1503542010122-splitting/chmcisprbddn08.chm.intra%2C16020%2C1503542010122.default.1503542017092, length=153
2018-08-03 10:49:47,692 INFO [RS_LOG_REPLAY_OPS-CHMCISPRBDDN01:16020-0] wal.WALSplitter: DistributedLogReplay = false
2018-08-03 10:49:47,693 INFO [RS_LOG_REPLAY_OPS-CHMCISPRBDDN01:16020-0] util.FSHDFSUtils: Recovering lease on dfs file hdfs://prodcluster/apps/hbase/data/WALs/chmcisprbddn08.chm.intra,16020,1503542010122-splitting/chmcisprbddn08.chm.intra%2C16020%2C1503542010122.default.1503542017092
2018-08-03 10:49:47,693 INFO [RS_LOG_REPLAY_OPS-CHMCISPRBDDN01:16020-0] util.FSHDFSUtils: recoverLease=true, attempt=0 on file=hdfs://prodcluster/apps/hbase/data/WALs/chmcisprbddn08.chm.intra,16020,1503542010122-splitting/chmcisprbddn08.chm.intra%2C16020%2C1503542010122.default.1503542017092 after 0ms
2018-08-03 10:49:47,695 INFO [RS_LOG_REPLAY_OPS-CHMCISPRBDDN01:16020-0] wal.WALSplitter: Processed 0 edits across 0 regions; edits skipped=0; log file=hdfs://prodcluster/apps/hbase/data/WALs/chmcisprbddn08.chm.intra,16020,1503542010122-splitting/chmcisprbddn08.chm.intra%2C16020%2C1503542010122.default.1503542017092, length=153, corrupted=false, progress failed=false
2018-08-03 10:49:47,695 WARN [RS_LOG_REPLAY_OPS-CHMCISPRBDDN01:16020-0] regionserver.SplitLogWorker: log splitting of WALs/chmcisprbddn08.chm.intra,16020,1503542010122-splitting/chmcisprbddn08.chm.intra%2C16020%2C1503542010122.default.1503542017092 failed, returning error
java.io.IOException: Cannot get log reader
    at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:355)
    at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:267)
    at org.apache.hadoop.hbase.wal.WALSplitter.getReader(WALSplitter.java:839)
    at org.apache.hadoop.hbase.wal.WALSplitter.getReader(WALSplitter.java:763)
    at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:297)
    at org.apache.hadoop.hbase.wal.WALSplitter.splitLogFile(WALSplitter.java:235)
    at org.apache.hadoop.hbase.regionserver.SplitLogWorker$1.exec(SplitLogWorker.java:104)
    at org.apache.hadoop.hbase.regionserver.handler.WALSplitterHandler.process(WALSplitterHandler.java:72)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.UnsupportedOperationException: Unable to find org.apache.hadoop.hbase.regionserver.wal.WALCellCodec,org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec
    at org.apache.hadoop.hbase.util.ReflectionUtils.instantiateWithCustomCtor(ReflectionUtils.java:36)
    at org.apache.hadoop.hbase.regionserver.wal.WALCellCodec.create(WALCellCodec.java:103)
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.getCodec(ProtobufLogReader.java:297)
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.initAfterCompression(ProtobufLogReader.java:307)
    at org.apache.hadoop.hbase.regionserver.wal.ReaderBase.init(ReaderBase.java:82)
    at org.apache.hadoop.hbase.regionserver.wal.ProtobufLogReader.init(ProtobufLogReader.java:164)
    at org.apache.hadoop.hbase.wal.WALFactory.createReader(WALFactory.java:303)
    ... 11 more
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.regionserver.wal.WALCellCodec,org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec
    at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:264)
    at org.apache.hadoop.hbase.util.ReflectionUtils.instantiateWithCustomCtor(ReflectionUtils.java:32)
    ... 17 more
Created 08-03-2018 06:33 AM
It seems the split task is failing:
2018-08-03 10:49:47,695 WARN [RS_LOG_REPLAY_OPS-CHMCISPRBDDN01:16020-0] regionserver.SplitLogWorker: log splitting of WALs/chmcisprbddn08.chm.intra,16020,1503542010122-splitting/chmcisprbddn08.chm.intra%2C16020%2C1503542010122.default.1503542017092 failed, returning error java.io.IOException: Cannot get log reader
There also seems to be an issue with the class:
Caused by: java.lang.UnsupportedOperationException: Unable to find org.apache.hadoop.hbase.regionserver.wal.WALCellCodec,org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.hbase.regionserver.wal.WALCellCodec,org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec
The RegionServer seems to be unable to find the class.
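One possible reading of the ClassNotFoundException above is that the two codec class names, joined by a comma, were handed to Class.forName as a single name. The sketch below is not HBase code: the class CodecNameCheck and its method are hypothetical, and JDK classes stand in for the codecs. It only illustrates that a comma-joined name fails to load even when each class on its own is available.

```java
class CodecNameCheck {
    // Returns true if the class loader can resolve the given name as one class.
    static boolean loadable(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Stand-in names (assumption): any two classes that are on the classpath.
        String first = "java.lang.String";
        String second = "java.util.ArrayList";

        System.out.println(loadable(first));                // each name alone resolves
        System.out.println(loadable(second));
        System.out.println(loadable(first + "," + second)); // the joined name does not
    }
}
```

If this reading applies here, the presence of both classes in a jar would not by itself make the joined name loadable.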
What are the HDP, Ambari, and HBase versions? Please share the output of rpm -qa if that is okay.
Also, can you check the content of the WAL:
# hdfs dfs -cat /apps/hbase/data/WALs/chmcisprbddn08.chm.intra,16020,1503542010122-splitting/chmcisprbddn08.chm.intra%2C16020%2C1503542010122.default.1503542017092
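To map a split task name from the RegionServer log to the WAL path used in the command above, the task name can be URL-decoded once. This is a minimal sketch based only on the names visible in the logs; the class SplitTaskPath and its method are hypothetical helpers, not part of HBase.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

class SplitTaskPath {
    // Decodes one level of URL encoding: %2F becomes '/', %2C becomes ',',
    // and %252C becomes the literal text %2C that appears in the WAL file name.
    static String toWalPath(String taskName) {
        try {
            return URLDecoder.decode(taskName, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 should always be available", e);
        }
    }

    public static void main(String[] args) {
        // Task name copied from the "acquired task" log line, without the znode prefix.
        String task = "WALs%2Fchmcisprbddn08.chm.intra%2C16020%2C1503542010122-splitting"
                + "%2Fchmcisprbddn08.chm.intra%252C16020%252C1503542010122.default.1503542017092";
        // /apps/hbase/data is the HBase root directory seen in the log paths.
        System.out.println("/apps/hbase/data/" + toWalPath(task));
    }
}
```

Decoding only once matters: the file name on HDFS keeps the literal %2C sequences, so a second decode would produce a path that does not exist.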
Created 08-03-2018 06:34 AM
What do you see in the HBase Master logs during the time of these logs in RegionServer?
Created 08-03-2018 08:37 AM
Hi Ravi,
Thanks for writing. Below are the requested details:
Ambari version - HDP-2.6.1.0
HBase version - 1.1.2
Phoenix version - 4.7.0
Regarding the classes, I checked the Phoenix client jar (phoenix-4.7.0.2.6.1.0-129-client.jar).
Both classes are present in that jar, and the jar is in the HBase lib folder.
Thanks,
Prashant Verma
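The jar check described above could be repeated programmatically along these lines. This is only a sketch: JarClassCheck and its methods are hypothetical names, and the jar path is whatever is passed on the command line.

```java
import java.io.IOException;
import java.util.jar.JarFile;

class JarClassCheck {
    // Converts a fully qualified class name to its entry path inside a jar.
    static String entryName(String className) {
        return className.replace('.', '/') + ".class";
    }

    // Returns true if the jar at jarPath holds an entry for the given class.
    static boolean jarContains(String jarPath, String className) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return jar.getEntry(entryName(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.err.println("usage: java JarClassCheck <jar> <class> [<class> ...]");
            return;
        }
        // Check each class name separately, one lookup per name.
        for (int i = 1; i < args.length; i++) {
            System.out.println(args[i] + " -> " + jarContains(args[0], args[i]));
        }
    }
}
```

It would be run with the jar path followed by each class name as a separate argument, for example the two codec class names from the stack trace.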
Created 08-03-2018 08:41 AM
The content of the specified HDFS file is below:
PWAL‹ "ProtobufLogWriter*rorg.apache.hadoop.hbase.regionserver.wal.WALCellCodec,org.apache.hadoop.hbase.regionserver.wal.IndexedWALEditCodec LAWP
Created 08-03-2018 08:48 AM
Dump of the HBase Master log:
2018-08-03 12:59:58,968 ERROR [MASTER_SERVER_OPERATIONS-CHMCISPRBDSN01:16000-0] executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for chmcisprbddn01.chm.intra,16020,1503542011010, will retry
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:378)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:222)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://prodcluster/apps/hbase/data/WALs/chmcisprbddn01.chm.intra,16020,1503542011010-splitting] Task = installed = 1 done = 0 error = 1
    at org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:290)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:429)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:402)
    at org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:319)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:215)
    ... 4 more
2018-08-03 12:59:58,969 WARN [MASTER_SERVER_OPERATIONS-CHMCISPRBDSN01:16000-0] hbase.HBaseConfiguration: Config option "hbase.regionserver.lease.period" is deprecated. Instead, use "hbase.client.scanner.timeout.period"
2018-08-03 12:59:58,971 INFO [MASTER_SERVER_OPERATIONS-CHMCISPRBDSN01:16000-2] handler.ServerShutdownHandler: Splitting logs for chmcisprbddn02.chm.intra,16020,1503542193852 before assignment; region count=0
2018-08-03 12:59:58,973 INFO [MASTER_SERVER_OPERATIONS-CHMCISPRBDSN01:16000-2] master.SplitLogManager: dead splitlog workers [chmcisprbddn02.chm.intra,16020,1503542193852]
2018-08-03 12:59:58,973 INFO [MASTER_SERVER_OPERATIONS-CHMCISPRBDSN01:16000-2] master.SplitLogManager: started splitting 1 logs in [hdfs://prodcluster/apps/hbase/data/WALs/chmcisprbddn02.chm.intra,16020,1503542193852-splitting] for [chmcisprbddn02.chm.intra,16020,1503542193852]
2018-08-03 12:59:58,977 INFO [main-EventThread] coordination.SplitLogManagerCoordination: task /hbase-unsecure/splitWAL/WALs%2Fchmcisprbddn02.chm.intra%2C16020%2C1503542193852-splitting%2Fchmcisprbddn02.chm.intra%252C16020%252C1503542193852.default.1503542200899 acquired by chmcisprbddn07.chm.intra,16020,1533218036449
2018-08-03 12:59:59,005 INFO [main-EventThread] coordination.SplitLogManagerCoordination: task /hbase-unsecure/splitWAL/WALs%2Fchmcisprbddn02.chm.intra%2C16020%2C1503542193852-splitting%2Fchmcisprbddn02.chm.intra%252C16020%252C1503542193852.default.1503542200899 entered state: ERR chmcisprbddn07.chm.intra,16020,1533218036449
2018-08-03 12:59:59,006 WARN [main-EventThread] coordination.SplitLogManagerCoordination: Error splitting /hbase-unsecure/splitWAL/WALs%2Fchmcisprbddn02.chm.intra%2C16020%2C1503542193852-splitting%2Fchmcisprbddn02.chm.intra%252C16020%252C1503542193852.default.1503542200899
2018-08-03 12:59:59,006 WARN [MASTER_SERVER_OPERATIONS-CHMCISPRBDSN01:16000-2] master.SplitLogManager: error while splitting logs in [hdfs://prodcluster/apps/hbase/data/WALs/chmcisprbddn02.chm.intra,16020,1503542193852-splitting] installed = 1 but only 0 done
2018-08-03 12:59:59,006 ERROR [MASTER_SERVER_OPERATIONS-CHMCISPRBDSN01:16000-2] executor.EventHandler: Caught throwable while processing event M_SERVER_SHUTDOWN
java.io.IOException: failed log splitting for chmcisprbddn02.chm.intra,16020,1503542193852, will retry
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:378)
    at org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:222)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: error or interrupted while splitting logs in [hdfs://prodcluster/apps/hbase/data/WALs/chmcisprbddn02.chm.intra,16020,1503542193852-splitting] Task = installed = 1 done = 0 error = 1