Created on 08-27-2018 07:33 PM - edited 09-16-2022 06:38 AM
HBase version: 0.92.1-cdh4.0.1
I need to delete outdated files because the Hadoop cluster is running out of capacity.
When I checked HBase's disk usage, the .archive directory accounted for about 50% of it.
19,613G /hbase
10,055G /hbase/.archive
What is the .archive directory?
Can I lower the usage of this directory without losing data?
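For reference, here is a minimal sketch of the arithmetic behind the "about 50%" figure. It assumes the two sizes are exactly as listed above; the helper name `parse_size_gb` is only illustrative.

```python
def parse_size_gb(text):
    """Parse a size such as '19,613G' into an integer number of gigabytes."""
    return int(text.rstrip("G").replace(",", ""))

# Sizes copied from the listing above.
usage = {
    "/hbase": parse_size_gb("19,613G"),
    "/hbase/.archive": parse_size_gb("10,055G"),
}

# Fraction of the HBase root directory taken up by .archive.
archive_share = usage["/hbase/.archive"] / usage["/hbase"]
print(f"{archive_share:.1%}")
```

So .archive holds slightly more than half of everything under /hbase.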
Created 08-28-2018 06:37 PM
Created 09-06-2018 01:07 AM
Created 08-27-2018 08:04 PM
Created on 08-28-2018 06:32 PM - edited 08-28-2018 06:37 PM
The version is hbase-0.92.1 (CDH 4.0.1).
Does this version contain an archive directory?
These are the subdirectories in the archive directory:
/hbase/.archive/-ROOT-
/hbase/.archive/.META.
/hbase/.archive/[tablename]
/hbase/.archive/[tablename]
/hbase/.archive/[tablename]
.....
There is no snapshot directory.
The hbase shell does not have a list_snapshots command.
Created on 08-28-2018 06:52 PM - edited 08-28-2018 07:22 PM
I perform major compaction twice a week.
Could the archive directory be related to compaction?
I checked the JIRA issue below, but I do not know whether it applies to my HBase version.
https://issues.apache.org/jira/browse/HBASE-10371
Also, I found the following warning log in the hbase master log:
WARN: org.apache.hadoop.hbase.util.FSTableDescriptors: the following folder is in hbase's root directory and doesn't contain a table descriptor, do consider deleting it : .archive
Created on 09-05-2018 11:13 PM - edited 09-05-2018 11:28 PM
Maybe my CDH version is not 4.0.1.
I found some anomalies in Cloudera Manager.
I checked the information for each host from the following menu in Cloudera Manager.
[Cloudera manager > Hosts > (Click one host) > Components ]
The version of installed components on each host was:
----------------------------------------------
Cloudera Manager Agent 4.0.4
Cloudera Manager Management Daemons 4.0.4
HBase 0.92.1+67
----------------------------------------------
Also, one Region Server reports HBase version "0.94.15 + 114(0.94.15-cdh4.7.0)", which differs from the other Region Servers.
I want to free up space by deleting the '.archive' directory.
Can I delete the '.archive' directory myself?
If so, can I proceed without interruption of the HBase service?
Or should I stop HBase, delete the directory, and then restart HBase?
Can I disable archiving after deleting the '.archive' directory?
I need your help.
Created on 09-06-2018 01:16 AM - edited 09-06-2018 01:24 AM
In addition, the HDFS audit log shows operations on the .archive directory:
... ugi=hbase(auth:SIMPLE) ip=/192.16.1.150 cmd=mkdir src=/hbase/.archive/tableName/99082b8b557...
... ugi=hbase(auth:SIMPLE) ip=/192.16.1.150 cmd=rename src=/hbase/tableName/99082b8b557... dst=/hbase/.archive/tableName/99082b8b557...
192.16.1.150 is the one server running a different version (0.94.15+114).
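To check whether every write into .archive comes from that one host, the audit log can be tallied by client IP and command. This is a rough sketch only: it assumes each entry carries whitespace-separated `key=value` fields as in the two lines above, and the function name `archive_ops_by_ip` is made up for illustration. The sample lines are copied as-is, including the truncated region names.

```python
import re
from collections import Counter

# Matches key=value pairs such as ip=/192.16.1.150 or cmd=rename.
FIELD = re.compile(r"(\w+)=(\S+)")

def archive_ops_by_ip(lines, archive_root="/hbase/.archive"):
    """Count audit entries whose src or dst lies under archive_root, per (ip, cmd)."""
    counts = Counter()
    for line in lines:
        fields = dict(FIELD.findall(line))
        touches_archive = any(
            fields.get(key, "").startswith(archive_root) for key in ("src", "dst")
        )
        if touches_archive:
            ip = fields.get("ip", "").lstrip("/")
            counts[(ip, fields.get("cmd"))] += 1
    return counts

# The two audit lines quoted above, unchanged.
sample = [
    "... ugi=hbase(auth:SIMPLE) ip=/192.16.1.150 cmd=mkdir src=/hbase/.archive/tableName/99082b8b557...",
    "... ugi=hbase(auth:SIMPLE) ip=/192.16.1.150 cmd=rename src=/hbase/tableName/99082b8b557... dst=/hbase/.archive/tableName/99082b8b557...",
]

counts = archive_ops_by_ip(sample)
for (ip, cmd), n in sorted(counts.items()):
    print(ip, cmd, n)
```

Running the same function over the full audit log would show whether any host other than 192.16.1.150 moves files into .archive.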
-----------------------------
This is the result of running the list_snapshots command in the hbase shell on 192.16.1.150, as you suggested.
hbase(main):001:0> list_snapshots
SNAPSHOT TABLE + CREATION TIME
ERROR: java.io.IOException: java.io.IOException: java.lang.NoSuchMethodException: org.apache.hadoop.hbase.ipc.HMasterInterface.listSnapshots()
at java.lang.Class.getMethod(Class.java:1605)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:334)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1336)
Here is some help for this command:
List all snapshots taken (by printing the names and relative information).
Optional regular expression parameter could be used to filter the output
by snapshot name.
Examples:
hbase> list_snapshots
hbase> list_snapshots 'abc.*'
Created 09-27-2018 12:23 AM
Created 09-27-2018 12:39 AM