
HBase snapshots as Map-reduce job input - permission denied


Hi everybody,

 

I wonder if someone could explain what is going on internally when I use an HBase snapshot as input for a map-reduce job, as explained in [1] (configured via the `initTableSnapshotMapperJob` API described in [2]).

 

My app does the following (a condensed sketch follows the list):

1. creates a snapshot using the `HBaseAdmin` API

2. creates a new HDFS directory in the user's home

3. calls `initTableSnapshotMapperJob` to configure a TableMapper job to run on the created snapshot
     (passing the new directory as the tmp restore directory)

4. sets a few more job parameters (the job creates HFiles for bulk import) and then waits for job completion

5. deletes the temporary directory
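
Concretely, the flow looks roughly like this. The table, snapshot, and directory names and `MyMapper` are placeholders for illustration; the `initTableSnapshotMapperJob` overload is the one from [2] that takes a tmp restore directory as its last argument:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;

public class SnapshotMRJob {

    // Placeholder mapper; the real one emits KeyValues for the HFile output.
    static class MyMapper extends TableMapper<ImmutableBytesWritable, KeyValue> {
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();

        // Step 1: snapshot the source table via the HBaseAdmin API.
        HBaseAdmin admin = new HBaseAdmin(conf);
        admin.snapshot("my_snapshot", TableName.valueOf("my_table"));
        admin.close();

        // Step 2: create a tmp restore directory in the user's home.
        FileSystem fs = FileSystem.get(conf);
        Path restoreDir = new Path(fs.getHomeDirectory(), "snapshot-restore");
        fs.mkdirs(restoreDir);

        // Step 3: configure a TableMapper job over the snapshot, passing
        // restoreDir as the tmp restore directory. This is the call that
        // throws the AccessControlException shown in [3].
        Job job = Job.getInstance(conf, "mr-over-snapshot");
        TableMapReduceUtil.initTableSnapshotMapperJob(
                "my_snapshot", new Scan(), MyMapper.class,
                ImmutableBytesWritable.class, KeyValue.class,
                job, true, restoreDir);

        // Step 4: further setup (HFile output for bulk import) and run.
        job.waitForCompletion(true);

        // Step 5: delete the temporary restore directory.
        fs.delete(restoreDir, true);
    }
}
```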

 

The problem I am stuck with is that the initialisation (step 3) throws an exception about writing to /hbase/archive (!), after successfully creating the regions for the restored snapshot in the given tmp directory. The exception is given below [3].

I can see in the job's output that the regions are created before the exception, and the files from the table restore stay in the directory.

 

I was not expecting HBase to *write* anything to the HBase directories when using a snapshot with an explicitly given temporary directory to work in. What can I do to make this work?

 

All this is tested on a Cloudera QuickStart VM, by the way, but that should not really matter, IMHO.

 

Thanks

Jost

 

 

[1] http://www.slideshare.net/enissoz/mapreduce-over-snapshots

[2] https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/mapreduce/TableMapReduceUtil.html#initTable...

[3]

java.util.concurrent.ExecutionException: org.apache.hadoop.security.AccessControlException: Permission denied: user=cloudera, access=WRITE, inode="/hbase/archive":hbase:supergroup:drwxr-xr-x

at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:216)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:145)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6286)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6268)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6220)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4087)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4057)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4030)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:787)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(AuthorizationProviderProxyClientProtocol.java:297)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:594)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

 

1 ACCEPTED SOLUTION


OK, so here is the complete situation.

When you run an MR job on top of a snapshot, the MR framework looks at all the inputs and creates the tasks for them. However, those tasks might have to wait for some time before being executed, depending on the number of slots available on the cluster versus the number of tasks.

The issue is that if, while the tasks are pending, one of the inputs is moved/deleted/split/merged, etc., then the splits no longer point to valid input and the MR job will fail.

To avoid that, we have to create links to all the inputs so that HBase keeps a reference to those files even if they have to be moved, the same way a snapshot does. The catch is that those links have to live in the /hbase folder, and this is why you need write access there.

So to be able to run an MR job on top of a snapshot, you need a user with read/write access to the /hbase folder. This should be fixed in HBase 1.2 (but that is just planned for now, so you will need to double-check as we get closer to 1.2).
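
As a workaround, one option is to grant that access with HDFS ACLs rather than opening /hbase to everyone. A minimal sketch, assuming HDFS ACLs are enabled (dfs.namenode.acls.enabled=true) and that it runs as a user allowed to change permissions on /hbase (e.g. hdfs); the user name cloudera is taken from your stack trace:

```java
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.AclEntry;
import org.apache.hadoop.fs.permission.AclEntryScope;
import org.apache.hadoop.fs.permission.AclEntryType;
import org.apache.hadoop.fs.permission.FsAction;

public class GrantHBaseAccess {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());

        // Grant the MR job user (here: cloudera) rwx on /hbase and /hbase/archive.
        AclEntry access = new AclEntry.Builder()
                .setScope(AclEntryScope.ACCESS)
                .setType(AclEntryType.USER)
                .setName("cloudera")
                .setPermission(FsAction.ALL)
                .build();

        // A DEFAULT-scope entry on /hbase/archive so that newly created
        // subdirectories inherit the permission (may be needed, since the
        // links are created in subdirectories).
        AclEntry inherit = new AclEntry.Builder()
                .setScope(AclEntryScope.DEFAULT)
                .setType(AclEntryType.USER)
                .setName("cloudera")
                .setPermission(FsAction.ALL)
                .build();

        fs.modifyAclEntries(new Path("/hbase"), Arrays.asList(access));
        fs.modifyAclEntries(new Path("/hbase/archive"), Arrays.asList(access, inherit));
    }
}
```

The equivalent can of course be done from the command line with `hdfs dfs -setfacl`.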

Also, please keep in mind that doing MR on top of snapshots bypasses all the HBase layers. Therefore, if any ACLs or cell-level security are activated on the initial table, they will all be bypassed by the MR job. Everything will be readable by the job.

Let me know if you have any other questions or if I can help with anything else.

HTH.

JMS




Hi Jean-Marc,

Thanks for your thorough analysis!
Making sure the HFiles stay around makes perfect sense, so it is just a permissions issue.
And hopefully this will be fixed in HBase 1.2, then? I will use a permissions workaround in the meantime.

Best regards
Jost