Falcon fails to create feed with HDFS permission error

Rising Star

Hi,

Our Falcon installation abruptly ceased to work and no feeds could be created. It complained about file permissions:

Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): Permission denied: user=kefi, access=WRITE, inode="/apps/falcon-MiddleGate/staging/falcon/workflows/feed":middlegate_test1:falcon:drwxr-xr-x
   at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkFsPermission(FSPermissionChecker.java:271)
   at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:257)
   at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:238)
   at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:179)
   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6515)
   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkPermission(FSNamesystem.java:6497)
   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkAncestorAccess(FSNamesystem.java:6449)
   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInternal(FSNamesystem.java:4251)
   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirsInt(FSNamesystem.java:4221)
   at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.mkdirs(FSNamesystem.java:4194)
   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.mkdirs(NameNodeRpcServer.java:813)
   at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.mkdirs(ClientNamenodeProtocolServerSideTranslatorPB.java:600)
   at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
   at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:619)
   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:962)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2039)
   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2035)
   at java.security.AccessController.doPrivileged(Native Method)
   at javax.security.auth.Subject.doAs(Subject.java:415)
   at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
   at org.apache.hadoop.ipc.Server$Handler.run(Server.ja

where 'kefi' is the user trying to create the feed and 'middlegate_test1' is another user who had created a feed earlier.

The folders on HDFS looked like this:

bash-4.1$ hadoop fs -ls /apps/falcon-MiddleGate/staging/falcon/workflows/
Found 2 items
drwxr-xr-x   - middlegate_test1 falcon          0 2015-12-02 09:13 /apps/falcon-MiddleGate/staging/falcon/workflows/feed
drwxrwxrwx   - middlegate_test1 falcon          0 2015-12-02 09:13 /apps/falcon-MiddleGate/staging/falcon/workflows/process

I can think of two questions related to this:

  • Why is the permission of the 'feed' folder now 'drwxr-xr-x', whereas the 'process' folder has 'drwxrwxrwx'? Feed creation worked before, so someone or something must have changed it. It is not very likely that a user did it manually; is it possible that Falcon itself changed it?
  • It does not seem correct that such internal Falcon system folders are owned by an ordinary user, quite probably the first one who ever created an entity in Falcon. Is this expected, or is it a misconfiguration on our side?

Thanks for any input,

Regards,

Pavel

1 ACCEPTED SOLUTION

Super Collaborator

@Pavel Benes For Falcon to work, the user must create the staging and working directories specified in the cluster entity before submitting the cluster entity. The staging directory must have permission 777 and the working directory 755.
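
For example, with the paths from the listing above, the directories could be prepared along these lines (a sketch run as the HDFS superuser or the Falcon admin; the exact paths and ownership depend on your cluster entity definition):

bash-4.1$ hadoop fs -mkdir -p /apps/falcon-MiddleGate/staging /apps/falcon-MiddleGate/working
bash-4.1$ hadoop fs -chown falcon:falcon /apps/falcon-MiddleGate/staging /apps/falcon-MiddleGate/working
bash-4.1$ hadoop fs -chmod 777 /apps/falcon-MiddleGate/staging
bash-4.1$ hadoop fs -chmod 755 /apps/falcon-MiddleGate/working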

From the exception, it looks like the required staging directory is owned by user "middlegate_test1" and user "kefi" is not able to write to it. To solve this issue, try submitting the cluster entity to that cluster as user "kefi" and then submit/schedule the feed/process entity.
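
Something along these lines, run as 'kefi' (the entity file names and the feed name here are placeholders, not from the original post):

bash-4.1$ falcon entity -type cluster -submit -file middlegate-cluster.xml
bash-4.1$ falcon entity -type feed -submit -file my-feed.xml
bash-4.1$ falcon entity -type feed -schedule -name my-feed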

In my case, when I submit a feed/process entity, the feed and process directories are created with permission 755 under a staging directory that has permission 777. I don't think Falcon itself changed the permissions of the feed/process directories under the staging directory, which usually contains the configuration XMLs, log files, JAR files, etc. required for executing the feed/process entity.


5 REPLIES


Master Mentor

Thanks @peeyush

Expert Contributor

@Pavel Benes

The /apps/falcon-MiddleGate/staging/ and /apps/falcon-MiddleGate/working dirs are created when the cluster entity is submitted by the user. These dirs are used to store Falcon-specific information; the staging dir should have permissions of 777 and the working dir should have permissions of 755. Falcon expects that in real use cases a Falcon cluster entity is created by the admin and feed/process entities are created by the users of the cluster.
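
A quick sanity check of those modes, using the paths from this thread (the expected modes are noted in the comment):

bash-4.1$ hadoop fs -ls /apps/falcon-MiddleGate
# expect: drwxrwxrwx on .../staging and drwxr-xr-x on .../working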

1. Why is the permission for the 'feed' folder now 'drwxr-xr-x'? -- Falcon creates <staging_dir>/falcon/workflows/feed and <staging_dir>/falcon/workflows/process only when a feed/process entity is scheduled. The owner of these dirs is the user scheduling the entity, and the permissions are based on the default umask of the FS.
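
For reference, HDFS applies fs.permissions.umask-mode when a client creates a directory without an explicit mode, so a requested 777 becomes 777 & ~022 = 755, which matches the drwxr-xr-x seen on the 'feed' dir. The effective value can be checked with the command below (022 shown here is the Hadoop default):

bash-4.1$ hdfs getconf -confKey fs.permissions.umask-mode
022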

2. I am inclined to agree with you. <staging_dir>/falcon/workflows/process and <staging_dir>/falcon/workflows/feed should be created when the cluster entity is submitted, and the ownership should belong to falcon, with perms 777. I created a Jira https://issues.apache.org/jira/browse/FALCON-1647 and I will update/resolve it after discussing with the Falcon community.

The temporary workaround for this problem is to manually change the permissions of all dirs up to <staging_dir>/falcon/workflows/process and <staging_dir>/falcon/workflows/feed to 777, as @peeyush suggested.
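
With the paths from this thread, that workaround would look roughly like this (run as the HDFS superuser or the owning user):

bash-4.1$ hadoop fs -chmod 777 /apps/falcon-MiddleGate/staging/falcon
bash-4.1$ hadoop fs -chmod 777 /apps/falcon-MiddleGate/staging/falcon/workflows
bash-4.1$ hadoop fs -chmod 777 /apps/falcon-MiddleGate/staging/falcon/workflows/feed
bash-4.1$ hadoop fs -chmod 777 /apps/falcon-MiddleGate/staging/falcon/workflows/process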

Rising Star

@Balu

Thanks for filing the issue. I understand that the immediate cause of the failure is insufficient HDFS permissions on the 'feed' folder. However, I am puzzled about what triggered this. We had been using the same Falcon installation (with both the 'kefi' and 'middlegate_test1' users) for several weeks without problems. Around the same time we experienced cluster/YARN overload, since some processes were running with a minute(1) frequency, but I am not sure whether this could be related.

Expert Contributor

My understanding is that cluster/YARN overload and this issue are not related.