HBase master errors HFileArchiver

Contributor

Hello,

We are running the CDH 5.3.3 distribution. I have the following error messages repeating in the HBase Master log:

2015-07-21 16:36:20,964 WARN org.apache.hadoop.hbase.backup.HFileArchiver: Failed to archive class org.apache.hadoop.hbase.backup.HFileArchiver$FileablePath, file:hdfs://server:8020/hbase/data/default/olap_cube/4dc9d309b1a90895260662f493799e45/metric/fcd52caf2c364269939dff6c4962e76b_SeqId_168130436_ on try #2
org.apache.hadoop.security.AccessControlException: Permission denied: user=hbase, access=WRITE, inode="/hbase/data/default/olap_cube/4dc9d309b1a90895260662f493799e45/metric/fcd52caf2c364269939dff6c4962e76b_SeqId_168130436_":justAnotherUser:hadoop:-rw-r--r--
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
        at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:151)
        at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6287)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6269)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPathAccess(FSNamesystem.java:6194)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setTimesInt(FSNamesystem.java:2159)
        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.setTimes(FSNamesystem.java:2137)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.setTimes(NameNodeRpcServer.java:991)
        at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.setTimes(AuthorizationProviderProxyClientProtocol.java:576)
        at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.setTimes(ClientNamenodeProtocolServerSideTranslatorPB.java:885)
        at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

        at sun.reflect.GeneratedConstructorAccessor14.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
        at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
        at org.apache.hadoop.hdfs.DFSClient.setTimes(DFSClient.java:2756)
        at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1304)
        at org.apache.hadoop.hdfs.DistributedFileSystem$25.doCall(DistributedFileSystem.java:1300)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.setTimes(DistributedFileSystem.java:1300)
        at org.apache.hadoop.hbase.util.FSUtils.renameAndSetModifyTime(FSUtils.java:1678)
        at org.apache.hadoop.hbase.backup.HFileArchiver$File.moveAndClose(HFileArchiver.java:586)
        at org.apache.hadoop.hbase.backup.HFileArchiver.resolveAndArchiveFile(HFileArchiver.java:425)
        at org.apache.hadoop.hbase.backup.HFileArchiver.resolveAndArchive(HFileArchiver.java:335)
        at org.apache.hadoop.hbase.backup.HFileArchiver.resolveAndArchive(HFileArchiver.java:347)
        at org.apache.hadoop.hbase.backup.HFileArchiver.resolveAndArchive(HFileArchiver.java:284)
        at org.apache.hadoop.hbase.backup.HFileArchiver.archiveRegion(HFileArchiver.java:137)
        at org.apache.hadoop.hbase.backup.HFileArchiver.archiveRegion(HFileArchiver.java:75)
        at org.apache.hadoop.hbase.master.CatalogJanitor.cleanParent(CatalogJanitor.java:333)
        at org.apache.hadoop.hbase.master.CatalogJanitor.scan(CatalogJanitor.java:254)
        at org.apache.hadoop.hbase.master.CatalogJanitor.chore(CatalogJanitor.java:101)
        at org.apache.hadoop.hbase.Chore.run(Chore.java:87)
        at java.lang.Thread.run(Thread.java:724)

I change ownership of these files manually to the 'hbase' user, but after a while the errors return, just with different files (and for other HBase tables too).

I suspect this problem occurs because HBase wants to archive these files (the stack trace shows the CatalogJanitor cleaning up a parent region after a split), but the permissions are wrong: our Oozie workflows that contain the bulk load run as 'justAnotherUser' instead of the 'hbase' user. We do this because we prepare the data and load it into HBase in the same workflow, and that workflow has to run as 'justAnotherUser'.

Did I guess right? What should I do to load data into HBase correctly? Should I separate the data preparation and the bulk load into different workflows? Is it perhaps possible to specify a user for the bulk load alone? Or is this a completely different issue?


3 REPLIES

Mentor
How exactly are you bulk loading? Could you chown the prepared files to 'hbase' before you trigger the bulk load, or use the SecureBulkLoad technique offered by HBase?
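
A minimal sketch of that chown step using the Hadoop FileSystem API follows. Note that only the HDFS superuser may change file ownership, so this would have to run with superuser credentials; the directory path and layout below are illustrative, not from the original post.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class ChownPreparedHFiles {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        FileSystem fs = FileSystem.get(conf);

        // Illustrative path: the directory holding the prepared HFiles,
        // with one subdirectory per column family.
        Path hfileDir = new Path("/user/justAnotherUser/hfiles");

        // Hand every prepared file over to the 'hbase' user before bulk loading.
        for (FileStatus family : fs.listStatus(hfileDir)) {
            for (FileStatus hfile : fs.listStatus(family.getPath())) {
                fs.setOwner(hfile.getPath(), "hbase", "hbase");
            }
            fs.setOwner(family.getPath(), "hbase", "hbase");
        }
        fs.setOwner(hfileDir, "hbase", "hbase");
    }
}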

Contributor

Thanks for your reply! We are generating HFiles and then doing this:

// Bulk-load the generated HFiles into the live table
LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
loader.doBulkLoad(new Path(hfilePath), table);
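
For reference, a self-contained version of that snippet against the HBase 0.98 client API that ships with CDH 5 might look like this (the path and table name are made up for illustration):

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;

public class BulkLoadHFiles {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();

        // Illustrative values: the HFileOutputFormat output directory
        // and the target table.
        String hfilePath = "/user/justAnotherUser/hfiles";
        HTable table = new HTable(conf, "olap_cube");
        try {
            // Moves the HFiles into the table's region directories.
            LoadIncrementalHFiles loader = new LoadIncrementalHFiles(conf);
            loader.doBulkLoad(new Path(hfilePath), table);
        } finally {
            table.close();
        }
    }
}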

Running 'chown' on the files before the bulk load is one way to do it, thanks for the information! I just wonder if it is the best way. Since not all of our clusters run in secure mode, is it possible to use SecureBulkLoad on a non-secure cluster?

Mentor (accepted solution)
It should be possible to perform SecureBulkLoad without enabling Kerberos, although I have not personally tested this mix. The config steps should just involve configuring the SecureBulkLoad endpoint and the staging directory settings on the RegionServers and clients.
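
For what it's worth, on HBase 0.98 that would presumably come down to something like the following in hbase-site.xml on the RegionServers and on the client side (the staging directory value is illustrative, and it generally needs to be writable by the users doing the bulk load):

<property>
  <name>hbase.bulkload.staging.dir</name>
  <value>/tmp/hbase-staging</value>
</property>
<property>
  <name>hbase.coprocessor.region.classes</name>
  <value>org.apache.hadoop.hbase.security.access.SecureBulkLoadEndpoint</value>
</property>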