We’ve hit an error in NiFi that I can only think is a bug. We were running a Hive query in a processor group that is read/write for Group A only, using a Hive controller service configured with Group A’s specific Hive user id. The processor failed due to insufficient permissions to access the content for the query on HDFS, and the user reported in the error was the id from Group B’s Hive controller service. That service is configured in Group B’s processor group, which is accessible only by Group B. So we are seeing a conflict of configurations between two completely separate processor groups.
Here is a snippet from the error in the logs:
org.apache.hadoop.security.AccessControlException: Permission denied: user=group_b_id, access=EXECUTE, inode="/user/group_a_id/tech/poc/out_table":group_a_id:group_a_group:drwxr-x---
The point of contention is user=group_b_id. Nowhere in our processor group do we reference that user id; it would only be used in a completely different processor group. Disabling and then re-enabling the controller service fixed the issue, at least for now, and we have not been able to recreate it.
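To make the log line easier to read, here is a minimal Python sketch that splits it into its fields and checks which permission bits would apply to the reported user. It assumes the message layout seen in this one line and the usual HDFS owner/group/other check; the function names (`parse_denial`, `applicable_bits`, `is_allowed`) are made up for illustration, and the group membership of group_b_id is an assumption that has to be supplied by the caller.

```python
import re

# Log line from the post, with stray whitespace in the path removed.
SAMPLE = (
    'org.apache.hadoop.security.AccessControlException: Permission denied: '
    'user=group_b_id, access=EXECUTE, '
    'inode="/user/group_a_id/tech/poc/out_table":group_a_id:group_a_group:drwxr-x---'
)

# Pattern inferred from the single line above; it is not a full grammar of
# Hadoop's messages and ignores sticky-bit or ACL markers in the mode string.
DENIED = re.compile(
    r'Permission denied: user=(?P<user>[^,\s]+), access=(?P<access>\w+), '
    r'inode="(?P<inode>[^"]*)":\s*(?P<owner>[^:\s]+):(?P<group>[^:\s]+):'
    r'(?P<mode>[-d][-rwx]{9})'
)


def parse_denial(line):
    """Return the fields of a 'Permission denied' message, or None."""
    match = DENIED.search(line)
    return match.groupdict() if match else None


def applicable_bits(fields, user_groups):
    """Pick the owner, group or other triplet that applies to the user."""
    perms = fields["mode"][1:]  # drop the leading file-type character
    if fields["user"] == fields["owner"]:
        return perms[0:3]
    if fields["group"] in user_groups:
        return perms[3:6]
    return perms[6:9]


def is_allowed(fields, user_groups):
    """Check the requested access against the applicable triplet.

    Handles READ, WRITE, EXECUTE and underscore combinations such as
    READ_EXECUTE only.
    """
    bits = applicable_bits(fields, user_groups)
    needed = {"READ": "r", "WRITE": "w", "EXECUTE": "x"}
    return all(needed[part] in bits for part in fields["access"].split("_"))
```

With `fields = parse_denial(SAMPLE)`, calling `is_allowed(fields, user_groups=set())` returns `False`: if group_b_id is not in group_a_group, it falls under the "other" bits, which are `---`. The same check with the user set to group_a_id passes, which is what we expected the processor to run as.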
Have there been other documented cases of controller service configurations for similar components interfering with one another?
When you say that disabling/enabling the controller service (CS) fixed the issue, were there any other errors, for example from group_a_id? If you started with new controller services for both and tried the reverse order (B then A), do you see the error there? I ask to try to determine whether some configuration caching is going on. At first glance at the NiFi code it doesn't appear that the two should interfere, so I am curious whether this happens at the NiFi processor level or at some underlying Hive level.
I couldn't reproduce this, but I'm not confident I had the right reproduction environment. I created two users with different groups, a HiveConnectionPool for each user, and two separate PutHiveQL processors, and did not get that error. I can't remember now if they were in two different processor groups, but this error is on the Hadoop side so that should not matter.
Sorry I didn't respond sooner. I haven't been able to reproduce this yet either, and haven't seen it since. The scenario did have two HiveConnectionPools pointing to the same Hive instance, with different users/dbs, in different processor groups. The error itself was on the Hadoop side, but it was thrown because of the permissions of the user the query was operating as, which would have been a function of some sort of interference between the controller services.