01-09-2020 06:12 AM
One of my HBase tables is stuck in limbo: a truncate command was fired but never completed and appears hung. On running hbck, I can see the following error:
2020-01-09 11:54:55,741 INFO [hbasefsck-pool1-t1] util.HBaseFsck: Region { meta => table_1,,1578390508424.a96068fc6ecaa2c66882df211b274e60., hdfs => hdfs://my-hdfs/apps/hbase/data/data/default/table_1/a96068fc6ecaa2c66882df211b274e60, deployed => , replicaId => 0 } is in META, and in a disabled tabled that is not deployed
2020-01-09 11:54:56,098 DEBUG [main-SendThread(machine1-bd.org.com:2181)] zookeeper.ClientCnxn: Reading reply sessionid:0x26f81dc781e001a, packet:: clientPath:null serverPath:null finished:false header:: 23,8 replyHeader:: 23,146028924982,0 request:: '/hbase/table-lock/table_1,F response:: v{'write-master:160000000000016}
2020-01-09 11:54:56,098 DEBUG [main-SendThread(machine1-bd.org.com:2181)] zookeeper.ClientCnxn: Reading reply sessionid:0x26f81dc781e001a, packet:: clientPath:null serverPath:null finished:false header:: 24,8 replyHeader:: 24,146028924982,0 request:: '/hbase/table-lock/table_1,F response:: v{'write-master:160000000000016}
2020-01-09 11:54:56,099 INFO [main] lock.ZKInterProcessLockBase: Lock is held by: write-master:160000000000016
2020-01-09 11:54:56,099 DEBUG [main-SendThread(machine1-bd.org.com:2181)] zookeeper.ClientCnxn: Reading reply sessionid:0x26f81dc781e001a, packet:: clientPath:null serverPath:null finished:false header:: 25,4 replyHeader:: 25,146028924982,0 request:: '/hbase/table-lock/table_1/write-master:160000000000016,F response:: #ffffffff000146d61737465723a3136303030ffffffb679ffffffd411152b5050425546a14a764656661756c741296275636b6574735f321224a1864656d756331707233322e64652e7072692e6f322e636f6d10ffffff807d18fffffff8ffffffdbffffffb5ffffff8ffffffff82d18ffffff9a12002ae7472756e63617465207461626c6530ffffffb6ffffffbcffffffcfffffff91fffffff82d,s{146028892170,146028892170,1578437434934,1578437434934,0,0,0,103444100363649031,117,0,146028892170}
ERROR: Table lock acquire attempt found:[tableName=default:table_1, lockOwner=mahchine1.org.com,16000,1578432818680, threadId=154, purpose=truncate table, isShared=false, createTime=1578437434934]
I'm trying to get the command to terminate. Can 'hbase hbck -fix' be used for this scenario? I tried it, but it does not seem to work. Any help is much appreciated.
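In case it helps, these are the kinds of checks I have been running (command syntax as for HBase 1.x, so adjust to your version; the znode path is the one from the DEBUG lines above):

```shell
# Read-only consistency report for the affected table (makes no changes)
hbase hbck -details table_1

# Inspect the table-lock znode that hbck reports as held
hbase zkcli ls /hbase/table-lock/table_1
```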
01-08-2020 02:27 AM
Hi,

I have a cluster running HBase with 1 Master and 6 RegionServers. Recently I noticed that HBase commands are queuing up and possibly hung (RUNNABLE state) for one table. When checking the state of the table, I can see that it is disabled. The command to enable the table, executed from the HBase shell, failed, stating that process id xxx (the previous truncate command) is already running. In the HBase UI, I can see that both the truncate and enable commands (shown under Procedures) are in RUNNABLE state. I tried the kill procedure command on the truncate command, but it returns false, indicating the procedure cannot be killed. I have data in other tables, and scan commands on those tables work fine. What might be the issue here, and how can I kill the command running in HBase and get the table back to a working state? Any help is much appreciated.

Regards,
Thomas

On checking the Master log, I can see the following warnings:

2020-01-08 11:03:28,168 WARN [RegionOpenAndInitThread-buckets_2-10] ipc.Client: interrupted waiting to send rpc request to server
java.lang.InterruptedException
    at java.util.concurrent.FutureTask.awaitDone(FutureTask.java:404)
    at java.util.concurrent.FutureTask.get(FutureTask.java:191)
    at org.apache.hadoop.ipc.Client$Connection.sendRpcRequest(Client.java:1094)
    at org.apache.hadoop.ipc.Client.call(Client.java:1457)
    at org.apache.hadoop.ipc.Client.call(Client.java:1398)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:233)
    at com.sun.proxy.$Proxy16.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getFileInfo(ClientNamenodeProtocolTranslatorPB.java:818)
    at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:291)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:203)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:185)
    at com.sun.proxy.$Proxy17.getFileInfo(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:283)
    at com.sun.proxy.$Proxy18.getFileInfo(Unknown Source)
    at sun.reflect.GeneratedMethodAccessor1.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hbase.fs.HFileSystem$1.invoke(HFileSystem.java:283)
    at com.sun.proxy.$Proxy18.getFileInfo(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2165)
    at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1442)
    at org.apache.hadoop.hdfs.DistributedFileSystem$26.doCall(DistributedFileSystem.java:1438)
    at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
    at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1438)
    at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1447)
    at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.createRegionOnFileSystem(HRegionFileSystem.java:898)
    at org.apache.hadoop.hbase.regionserver.HRegion.createHRegion(HRegion.java:6364)
    at org.apache.hadoop.hbase.util.ModifyRegionUtils.createRegion(ModifyRegionUtils.java:205)
    at org.apache.hadoop.hbase.util.ModifyRegionUtils$1.call(ModifyRegionUtils.java:173)
    at org.apache.hadoop.hbase.util.ModifyRegionUtils$1.call(ModifyRegionUtils.java:170)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
......
2020-01-08 11:03:28,438 DEBUG [WALProcedureStoreSyncThread] wal.WALProcedureStore: Roll new state log: 64132
2020-01-08 11:03:28,683 DEBUG [ProcedureExecutorThread-28] util.FSTableDescriptors: Current tableInfoPath = hdfs://my-hdfs/apps/hbase/data/.tmp/data/default/buckets_2/.tabledesc/.tableinfo.0000000001
2020-01-08 11:03:28,685 DEBUG [ProcedureExecutorThread-28] util.FSTableDescriptors: TableInfo already exists.. Skipping creation
2020-01-08 11:03:28,685 INFO [RegionOpenAndInitThread-buckets_2-1] regionserver.HRegion: creating HRegion buckets_2 HTD == 'buckets_2', {NAME => 'b', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'} RootDir = hdfs://my-hdfs/apps/hbase/data/.tmp Table name == buckets_2
2020-01-08 11:03:28,685 INFO [RegionOpenAndInitThread-buckets_2-2] regionserver.HRegion: creating HRegion buckets_2 HTD == 'buckets_2', {NAME => 'b', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'} RootDir = hdfs://my-hdfs/apps/hbase/data/.tmp Table name == buckets_2
2020-01-08 11:03:28,685 INFO [RegionOpenAndInitThread-buckets_2-3] regionserver.HRegion: creating HRegion buckets_2 HTD == 'buckets_2', {NAME => 'b', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'} RootDir = hdfs://my-hdfs/apps/hbase/data/.tmp Table name == buckets_2
2020-01-08 11:03:28,685 INFO [RegionOpenAndInitThread-buckets_2-4] regionserver.HRegion: creating HRegion buckets_2 HTD == 'buckets_2', {NAME => 'b', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'} RootDir = hdfs://my-hdfs/apps/hbase/data/.tmp Table name == buckets_2
2020-01-08 11:03:28,686 INFO [RegionOpenAndInitThread-buckets_2-5] regionserver.HRegion: creating HRegion buckets_2 HTD == 'buckets_2', {NAME => 'b', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'} RootDir = hdfs://my-hdfs/apps/hbase/data/.tmp Table name == buckets_2
....
2020-01-08 11:03:28,686 WARN [RegionOpenAndInitThread-buckets_2-3] regionserver.HRegionFileSystem: Trying to create a region that already exists on disk: hdfs://my-hdfs/apps/hbase/data/.tmp/data/default/buckets_2/2e97482e25a07a7eb17a113535474057
2020-01-08 11:03:28,686 WARN [RegionOpenAndInitThread-buckets_2-2] regionserver.HRegionFileSystem: Trying to create a region that already exists on disk: hdfs://my-hdfs/apps/hbase/data/.tmp/data/default/buckets_2/fbfd55d8d1852193a22875679852b1f2
2020-01-08 11:03:28,686 INFO [RegionOpenAndInitThread-buckets_2-3] regionserver.HRegion: creating HRegion buckets_2 HTD == 'buckets_2', {NAME => 'b', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'} RootDir = hdfs://my-hdfs/apps/hbase/data/.tmp Table name == buckets_2
2020-01-08 11:03:28,686 INFO [RegionOpenAndInitThread-buckets_2-2] regionserver.HRegion: creating HRegion buckets_2 HTD == 'buckets_2', {NAME => 'b', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE', DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true', BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'} RootDir = hdfs://my-hdfs/apps/hbase/data/.tmp Table name == buckets_2
2020-01-08 11:03:28,686 WARN [ProcedureExecutorThread-28] procedure.TruncateTableProcedure: Retriable error trying to truncate table=buckets_2 state=TRUNCATE_TABLE_CREATE_FS_LAYOUT
java.io.IOException: java.util.concurrent.ExecutionException: java.io.IOException: The specified region already exists on disk: hdfs://my-hdfs/apps/hbase/data/.tmp/data/default/buckets_2/2e97482e25a07a7eb17a113535474057
    at org.apache.hadoop.hbase.util.ModifyRegionUtils.createRegions(ModifyRegionUtils.java:186)
    at org.apache.hadoop.hbase.util.ModifyRegionUtils.createRegions(ModifyRegionUtils.java:141)
    at org.apache.hadoop.hbase.util.ModifyRegionUtils.createRegions(ModifyRegionUtils.java:118)
    at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure$3.createHdfsRegions(CreateTableProcedure.java:361)
    at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.createFsLayout(CreateTableProcedure.java:380)
    at org.apache.hadoop.hbase.master.procedure.CreateTableProcedure.createFsLayout(CreateTableProcedure.java:354)
    at org.apache.hadoop.hbase.master.procedure.TruncateTableProcedure.executeFromState(TruncateTableProcedure.java:113)
    at org.apache.hadoop.hbase.master.procedure.TruncateTableProcedure.executeFromState(TruncateTableProcedure.java:47)
    at org.apache.hadoop.hbase.procedure2.StateMachineProcedure.execute(StateMachineProcedure.java:107)
    at org.apache.hadoop.hbase.procedure2.Procedure.doExecute(Procedure.java:500)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1086)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:888)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execLoop(ProcedureExecutor.java:841)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$200(ProcedureExecutor.java:77)
    at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$1.run(ProcedureExecutor.java:443)
Caused by: java.util.concurrent.ExecutionException: java.io.IOException: The specified region already exists on disk: hdfs://my-hdfs/apps/hbase/data/.tmp/data/default/buckets_2/2e97482e25a07a7eb17a113535474057
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.hbase.util.ModifyRegionUtils.createRegions(ModifyRegionUtils.java:180)
    ... 14 more
Caused by: java.io.IOException: The specified region already exists on disk: hdfs://my-hdfs/apps/hbase/data/.tmp/data/default/buckets_2/2e97482e25a07a7eb17a113535474057
    at org.apache.hadoop.hbase.regionserver.HRegionFileSystem.createRegionOnFileSystem(HRegionFileSystem.java:900)
    at org.apache.hadoop.hbase.regionserver.HRegion.createHRegion(HRegion.java:6364)
    at org.apache.hadoop.hbase.util.ModifyRegionUtils.createRegion(ModifyRegionUtils.java:205)
    at org.apache.hadoop.hbase.util.ModifyRegionUtils$1.call(ModifyRegionUtils.java:173)
    at org.apache.hadoop.hbase.util.ModifyRegionUtils$1.call(ModifyRegionUtils.java:170)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
2020-01-08 11:03:28,686 WARN [RegionOpenAndInitThread-buckets_2-5] regionserver.HRegionFileSystem: Trying to create a region that already exists on disk: hdfs://my-hdfs/apps/hbase/data/.tmp/data/default/buckets_2/6b0206739eeaeae1894fd54d36986c6e
2020-01-08 11:03:28,686 WARN [RegionOpenAndInitThread-buckets_2-2] ipc.Client: interrupted waiting to send rpc request to server
java.lang.InterruptedException
......
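For reference, the kill attempt mentioned above was done from the HBase shell, roughly like this (xxx stands for the procedure id shown in the Master UI under Procedures; abort_procedure is available from HBase 1.2 onwards):

```shell
hbase shell
> list_procedures        # both the truncate and the enable procedure show as RUNNABLE
> abort_procedure xxx    # returns false for the stuck truncate procedure
```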
Labels: Apache HBase