HBase snapshots as Map-reduce job input - permission denied
Labels: Apache HBase
Created on 04-21-2015 12:55 AM - edited 09-16-2022 02:26 AM
Hi everybody,
I wonder if someone could explain what is going on internally when I use an HBase snapshot as input for map-reduce as explained in [1] (configured by the `initTableSnapshotMapperJob` API described in [2]).
My app does the following (a minimal sketch of this flow is given after the list):
1. creates a snapshot using the `HBaseAdmin` API
2. creates a new HDFS directory in the user's home
3. calls `initTableSnapshotMapperJob` to configure a TableMapper job to run on the created snapshot
(passing the new directory as the tmp restore directory)
4. sets a few more job parameters (the job creates HFiles for bulk import) and then waits for job completion
5. deletes the temporary directory
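In case it helps, here is a minimal, self-contained sketch of that flow, assuming HBase 0.98 on CDH 5.3; the table name, snapshot name, restore path and mapper class are placeholders I made up for illustration, and error handling is omitted.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.HBaseAdmin;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.TableMapReduceUtil;
import org.apache.hadoop.hbase.mapreduce.TableMapper;
import org.apache.hadoop.mapreduce.Job;

public class SnapshotMapReduceSketch {

    // Placeholder mapper; the real job emits KeyValues that later become HFiles.
    static class SnapshotMapper extends TableMapper<ImmutableBytesWritable, KeyValue> {
        @Override
        protected void map(ImmutableBytesWritable row, Result value, Context context) {
            // emit KeyValues for the bulk import here
        }
    }

    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();

        // Step 1: create a snapshot using the HBaseAdmin API.
        HBaseAdmin admin = new HBaseAdmin(conf);
        admin.snapshot("testtable-snapshot", TableName.valueOf("testtable"));
        admin.close();

        // Step 2: create a new HDFS directory in the user's home as the tmp restore dir.
        FileSystem fs = FileSystem.get(conf);
        Path restoreDir = new Path("/user/cloudera/snapshot-restore");
        fs.mkdirs(restoreDir);

        // Step 3: configure a TableMapper job to run on the snapshot,
        // passing the new directory as the tmp restore directory.
        Job job = Job.getInstance(conf, "snapshot-to-hfiles");
        job.setJarByClass(SnapshotMapReduceSketch.class);
        TableMapReduceUtil.initTableSnapshotMapperJob(
            "testtable-snapshot", new Scan(),
            SnapshotMapper.class, ImmutableBytesWritable.class, KeyValue.class,
            job, true, restoreDir);

        // Step 4: further job parameters (e.g. HFileOutputFormat2 for the bulk import)
        // would go here, then wait for completion.
        job.waitForCompletion(true);

        // Step 5: delete the temporary restore directory.
        fs.delete(restoreDir, true);
    }
}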
The problem I am stuck with is that the initialisation (step 3) throws an exception about writing to /hbase/archive (!), after successfully creating the Region servers for the restored snapshot, in the given tmp directory. The exception is given below [3].
I can see in the job's output that region servers are created before the exception, and the files from the table restore stay in the directory.
I was not expecting HBase to *write* anything to the HBase directories when using a snapshot with an explicitly given temporary directory to work with. What can I do to make this work?
All this is tested on a Cloudera QuickStart VM, btw, but that should not really matter IMHO.
Thanks
Jost
[1] http://www.slideshare.net/enissoz/mapreduce-over-snapshots
[3]
java.util.concurrent.ExecutionException: org.apache.hadoop.security.AccessControlException: Permission denied: user=cloudera, access=WRITE, inode="/hbase/archive":hbase:supergroup:drwxr-xr-x
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkFsPermission(DefaultAuthorizationProvider.java:257)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:238)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.check(DefaultAuthorizationProvider.java:216)
at org.apache.hadoop.hdfs.server.namenode.DefaultAuthorizationProvider.checkPermission(DefaultAuthorizationProvider.java:145)
at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:138)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6286)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6268)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6220)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4087)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4057)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4030)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:787)
at org.apache.hadoop.hdfs.server.namenode.AuthorizationProviderProxyClientProtocol.mkdirs(AuthorizationProviderProxyClientProtocol.java:297)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:594)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)
Created 05-08-2015 09:09 AM
OK, so here is the complete situation.
When you run an MR job on top of a snapshot, the MR framework looks at all the inputs and creates all the tasks for them. However, those tasks might have to wait for some time before being executed, depending on the number of slots available on the cluster versus the number of tasks.
The issue is, if one of the inputs is moved/deleted/split/merged, etc. while the tasks are pending, then the splits no longer point to a valid input and the MR job will fail.
To avoid that, we have to create links to all the inputs to make sure HBase keeps a reference to those files even if they have to be moved, the same way a snapshot does. The issue is, those links have to be in the /hbase folder. And this is why you need the rights for that.
So to be able to run an MR job on top of a snapshot, you need a user with read/write access to the /hbase folder. This should be fixed in HBase 1.2 (but it's just on the plans for now and you will need to double check when we get closer to 1.2).
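If it helps, one way to grant that access on a test setup is with an HDFS ACL. This is only a sketch under two assumptions: HDFS ACLs are enabled (dfs.namenode.acls.enabled=true) and the code runs as a user allowed to change permissions on /hbase (the hbase user or the HDFS superuser). On a real cluster you would more likely adjust permissions out of band or submit the job as a user that already has the access.

import java.util.Collections;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.fs.permission.AclEntry;
import org.apache.hadoop.fs.permission.AclEntryScope;
import org.apache.hadoop.fs.permission.AclEntryType;
import org.apache.hadoop.fs.permission.FsAction;

public class GrantHBaseRootAccess {
    public static void main(String[] args) throws Exception {
        FileSystem fs = FileSystem.get(new Configuration());
        // Grant the job-submitting user (here "cloudera", adjust as needed)
        // read/write/execute on the HBase root and archive directories,
        // so the snapshot restore can create its reference links there.
        AclEntry entry = new AclEntry.Builder()
            .setScope(AclEntryScope.ACCESS)
            .setType(AclEntryType.USER)
            .setName("cloudera")
            .setPermission(FsAction.ALL)
            .build();
        fs.modifyAclEntries(new Path("/hbase"), Collections.singletonList(entry));
        fs.modifyAclEntries(new Path("/hbase/archive"), Collections.singletonList(entry));
    }
}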
Also, please keep in mind that doing MR on top of snapshots bypasses all the HBase layers. Therefore, if there are any ACLs or cell-level security activated on the initial table, they will all be bypassed by the MR job. Everything will be readable by the job.
Let me know if you have any other question or if I can help with anything.
HTH.
JMS
Created 05-06-2015 07:49 AM
Hi Jost,
I will be looking at this to try to figure out what the issue is. Can you please confirm which HBase version you use? I see that it runs with the cloudera user; are you using the QuickStart VM? If so, can you please let me know which version so I can try with the same as you?
Thanks,
JMS
Created 05-06-2015 03:38 PM
I am using Cloudera QuickStart VMs (version 5.3.0) for tests. The problem can also be observed in our production system, which runs Cloudera 5.2.4.
The exception cannot be seen in all runs; it depends on the previous use of HBase. In the VM where I see the exception, HBase contains data that is around two weeks old, and some tables have been dropped.
(I suspected that restoring the snapshot triggers internal archiving and uses the wrong user.)
HTH
/ Jost
Created 05-06-2015 10:12 PM
Thanks for the information. I have started the download of the VM. I will have it by tomorrow morning and will test the scenario.
In the meantime, can you please clarify this:
"
The problem I am stuck with is that the initialisation (step 3) throws an exception about writing to /hbase/archive (!), after successfully creating the Region servers for the restored snapshot, in the given tmp directory. The exception is given below [3].
I can see in the job's output that region servers are created before the exception, and the files from the table restore stay in the directory.
"
What do you mean here by "Region servers"? This MR job should not create any region servers.
Thanks,
JM
Created 05-06-2015 10:34 PM
Sorry, that was a bit misleading. No region servers are created. What is in fact created are regions (by a method with prefix "RegionServer").
Before the stack trace, the output of the job contains messages "INFO regionserver.HRegion: creating HRegion testtable .."
(one of them for the test program, many of them for the real application, as it uses many regions).
/ Jost
Created 05-07-2015 04:13 PM
Hi,
Thanks for the clarification. I have downloaded the VM and have started to code an example.
Quick followup question.
By default, there is no rights management on the VM.
Can you please confirm you modified those 2 properties?
<property>
<name>dfs.permissions.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions</name>
<value>true</value>
</property>
Otherwise, you should not see any permission denied errors. If you have not modified them, can you please share your /etc/hadoop/conf content?
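For reference, here is a quick way to see which value the client configuration actually resolves; a minimal sketch assuming the hdfs-site.xml from /etc/hadoop/conf is on the classpath.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.HdfsConfiguration;

public class CheckDfsPermissions {
    public static void main(String[] args) {
        // HdfsConfiguration loads hdfs-default.xml and hdfs-site.xml from the classpath.
        Configuration conf = new HdfsConfiguration();
        System.out.println("dfs.permissions.enabled = "
            + conf.getBoolean("dfs.permissions.enabled", true));
    }
}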
Thanks,
JM
Created 05-07-2015 04:50 PM
Hi,
No, I did not modify any permissions. Also, I cannot find this property in the directory you are requesting (see below).
The folder contents are attached FYI.
/ Jost
- ** - ** - ** - ** - ** - ** - ** - ** - ** - ** - ** - ** - ** -
[cloudera@quickstart test]$ uname -a
Linux quickstart.cloudera 2.6.32-358.el6.x86_64 #1 SMP Fri Feb 22 00:31:26 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
[cloudera@quickstart test]$ grep -r -e dfs.permissions /etc/hadoop/conf
grep: /etc/hadoop/conf/container-executor.cfg: Permission denied
[cloudera@quickstart test]$ sudo grep -r -e dfs.permissions /etc/hadoop/conf
[cloudera@quickstart test]$ ls -lrt /etc/hadoop/conf/
total 44
-rw-r--r-- 1 root root 3906 Apr 20 19:37 yarn-site.xml
-rw-r--r-- 1 root root 315 Apr 20 19:37 ssl-client.xml
-rw-r--r-- 1 root root 4391 Apr 20 19:37 mapred-site.xml
-rw-r--r-- 1 root root 300 Apr 20 19:37 log4j.properties
-rw-r--r-- 1 root root 1669 Apr 20 19:37 hdfs-site.xml
-rw-r--r-- 1 root root 425 Apr 20 19:37 hadoop-env.sh
-rw-r--r-- 1 root root 3675 Apr 20 19:37 core-site.xml
-rw-r--r-- 1 root root 21 Apr 20 19:37 __cloudera_generation__
-r-------- 1 root hadoop 0 May 6 22:21 container-executor.cfg
-rwxr-xr-x 1 root hadoop 1510 May 6 22:21 topology.py
-rw-r--r-- 1 root hadoop 200 May 6 22:21 topology.map
[cloudera@quickstart test]$
Created 05-07-2015 04:54 PM
The default value is true, so if the property is not there, that means rights are on.
[cloudera@quickstart ~]$ ls -lrt /etc/hadoop/conf/
total 40
-rwxr-xr-x 1 root root 2375 Dec 3 01:39 yarn-site.xml
-rwxr-xr-x 1 root root 1104 Dec 3 01:39 README
-rwxr-xr-x 1 root root 2890 Dec 3 01:39 hadoop-metrics.properties
-rwxr-xr-x 1 root root 1366 Dec 3 01:39 hadoop-env.sh
-rwxr-xr-x 1 root root 11291 Dec 16 19:26 log4j.properties
-rw-rw-r-- 1 root root 1546 Dec 17 12:55 mapred-site.xml
-rw-rw-r-- 1 root root 1915 Dec 17 12:55 core-site.xml
-rw-rw-r-- 1 root root 3737 May 7 16:06 hdfs-site.xml
The files we have are a bit different. Have you activated CM?
I have extracted your CDH Java branch locally and will dig into the code. I looked at 1.0 and saw nothing wrong. But you are running on 0.98.6.
I will provide a feedback shortly.
Thanks,
JM
Created 05-07-2015 04:57 PM
Let me know if you still need /etc/hadoop/conf contents (which is actually a link to /etc/alternatives/hadoop-conf)
I am certain that I did not modify it (not consciously or manually, that is 🙂); it should be the default CDH 5.3.0 QuickStart one.
/ Jost
Created 05-08-2015 05:51 AM
FYI,
I'm able to reproduce the issue.
Steps:
1) Download 5.3.0 VM,
2) Change hadoop-site.xml to manage permissions,
3) Create and fill a table,
4) Create snapshot,
5) Try to MR over it.
I'm now debugging step by step to see where it's coming from.
Can you please send me your TableMapReduceUtil.initTableSnapshotMapperJob line?
Thanks,
JM
