
HDP 3.0 HBase backup issues


I'm running a clean HDP 3.0 single-node test cluster with HBase, YARN, and the other minimal requirements, and I'm having an issue with HBase backup / restore. This is how I did the setup:

# Store secrets.
hadoop credential create fs.s3a.access.key -value 'XXX' -provider localjceks://file/usr/hdp/jceks/aws.jceks
hadoop credential create fs.s3a.secret.key -value 'YYY' -provider localjceks://file/usr/hdp/jceks/aws.jceks
chmod 644 /usr/hdp/jceks/aws.jceks  # Not great for security purposes, just me being lazy for testing.
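
# Optional sanity check: list the stored credential aliases.
hadoop credential list -provider localjceks://file/usr/hdp/jceks/aws.jceks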

# Set the following setting in Custom core-site in Ambari.
hadoop.security.credential.provider.path=localjceks://file/usr/hdp/jceks/aws.jceks
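
# Quick sanity check that s3a can now authenticate (should list the bucket):
hdfs dfs -ls s3a://erikn-hdp-backup/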

# Set the following settings in Custom hbase-site in Ambari.
hbase.backup.enable=true
hbase.master.logcleaner.plugins=org.apache.hadoop.hbase.backup.master.BackupLogCleaner
hbase.procedure.master.classes=org.apache.hadoop.hbase.backup.master.LogRollMasterProcedureManager
hbase.procedure.regionserver.classes=org.apache.hadoop.hbase.backup.regionserver.LogRollRegionServerProcedureManager

# Set the following setting in Advanced hbase-site in Ambari.
hbase.coprocessor.region.classes=org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint,org.apache.hadoop.hbase.backup.BackupObserver
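
# Restart the affected services in Ambari so the settings above take effect.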

Once this was done I created a table 't' and put one row in it. In the hbase shell it looked roughly like the following (the 'd' column family and the row keys match the restore logs further down; the qualifier and values are placeholders):
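
create 't', 'd'
put 't', '1', 'd:c', 'value1'

I then issued the following command: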

hbase backup create full s3a://erikn-hdp-backup/backup

I then put another row in table 't':
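
put 't', '2', 'd:c', 'value2'

Then I issued the following command: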

hbase backup create incremental s3a://erikn-hdp-backup/backup
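
# Backup IDs and their statuses can be listed afterwards with:
hbase backup history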

I then disabled and dropped the table 't' and tried to restore it using the following command:

# backup_1536892696902 was the second (incremental) backup.
hbase restore s3a://erikn-hdp-backup/backup backup_1536892696902 -t t

What I got back was this (partial logs):

2018-09-14 02:47:08,742 INFO  [main] impl.RestoreTablesClient: Restoring 't' to 't' from full backup image s3a://erikn-hdp-backup/backup/backup_1536892302737/default/t
2018-09-14 02:47:11,229 INFO  [main] util.BackupUtils: Creating target table 't'
...
map & reduce steps of restoring the full backup image completed successfully
...
2018-09-14 02:48:14,391 WARN  [main] tool.LoadIncrementalHFiles: Skipping non-directory hdfs://hdp.local.xxx.com:8020/user/hbase/hbase-staging/bulk_output-default-t-1536893235078/_SUCCESS
2018-09-14 02:48:14,492 WARN  [main] tool.LoadIncrementalHFiles: SecureBulkLoadEndpoint is deprecated. It will be removed in future releases.
2018-09-14 02:48:14,492 WARN  [main] tool.LoadIncrementalHFiles: Secure bulk load has been integrated into HBase core.
2018-09-14 02:48:14,572 INFO  [LoadIncrementalHFiles-0] hfile.CacheConfig: Created cacheConfig: CacheConfig:disabled
2018-09-14 02:48:14,581 INFO  [LoadIncrementalHFiles-0] tool.LoadIncrementalHFiles: Trying to load hfile=hdfs://hdp.local.xxx.com:8020/user/hbase/hbase-staging/bulk_output-default-t-1536893235078/d/7d8ffd27478240629082c9404c2fb792 first=Optional[2] last=Optional[2]
2018-09-14 02:48:14,591 INFO  [LoadIncrementalHFiles-0] hfile.CacheConfig: Created cacheConfig: CacheConfig:disabled
2018-09-14 02:48:14,598 INFO  [LoadIncrementalHFiles-0] tool.LoadIncrementalHFiles: Trying to load hfile=hdfs://hdp.local.xxx.com:8020/user/hbase/hbase-staging/bulk_output-default-t-1536893235078/d/dcef505757f84982a9de36d950d03e6a first=Optional[1] last=Optional[1]
java.lang.IllegalArgumentException: Wrong FS: s3a://erikn-hdp-backup/backup/backup_1536892696902/default/t, expected: hdfs://hdp.local.xxx.com:8020
		at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:781)
		at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:240)
		at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1579)
		at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1576)
		at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
		at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1591)
		at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1734)
		at org.apache.hadoop.hbase.backup.impl.RestoreTablesClient.restoreImages(RestoreTablesClient.java:177)
		at org.apache.hadoop.hbase.backup.impl.RestoreTablesClient.restore(RestoreTablesClient.java:241)
		at org.apache.hadoop.hbase.backup.impl.RestoreTablesClient.execute(RestoreTablesClient.java:285)
		at org.apache.hadoop.hbase.backup.impl.BackupAdminImpl.restore(BackupAdminImpl.java:514)
		at org.apache.hadoop.hbase.backup.RestoreDriver.parseAndRun(RestoreDriver.java:182)
		at org.apache.hadoop.hbase.backup.RestoreDriver.doWork(RestoreDriver.java:220)
		at org.apache.hadoop.hbase.backup.RestoreDriver.run(RestoreDriver.java:262)
		at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
		at org.apache.hadoop.hbase.backup.RestoreDriver.main(RestoreDriver.java:231)
2018-09-14 02:48:15,360 INFO  [pool-4-thread-1] impl.MetricsSystemImpl: Stopping s3a-file-system metrics system...
2018-09-14 02:48:15,360 INFO  [pool-4-thread-1] impl.MetricsSystemImpl: s3a-file-system metrics system stopped.
2018-09-14 02:48:15,361 INFO  [pool-4-thread-1] impl.MetricsSystemImpl: s3a-file-system metrics system shutdown complete.

As you can see, the full backup restore worked, but the incremental one failed with "Wrong FS". This seems like a straight-up bug in how backup / restore is implemented, or am I doing something wrong?
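
In the meantime, an untested workaround I'm considering is to copy the backup set from S3 to HDFS with distcp and restore from the hdfs:// path instead (the staging path below is hypothetical):

hadoop distcp s3a://erikn-hdp-backup/backup hdfs://hdp.local.xxx.com:8020/tmp/hbase-backup
hbase restore hdfs://hdp.local.xxx.com:8020/tmp/hbase-backup backup_1536892696902 -t t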

I feel the documentation is a bit lacking, but I've mainly used the docs here:

https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/hbase-data-access/content/command-creating-...

https://docs.hortonworks.com/HDPDocuments/HDP3/HDP-3.0.0/hbase-data-access/content/BAR-use-case.html