Job fails after installing CDP 7.1.6
Created ‎08-27-2021 06:30 AM
We have a job that ran fine on 7.1.4 but is now failing after we upgraded to 7.1.6. It fails with permission errors while writing to a directory under /user/yarn. The job runs as a different user, so we are wondering why it has started writing what looks like temporary output to a directory under /user/yarn that the user running the job cannot access.
Created ‎09-24-2021 06:32 AM
As it turns out, it seems a combination of things caused this job to fail. First, we installed a minor Java version update, going from jdk8u242-b08 to openjdk-8u292-b10. Second, the developer changed the way the files were written from asynchronous to synchronous: the code had been using the CompletableFuture.runAsync method, which was removed in favor of plain synchronous writes.
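The change described above can be sketched roughly as follows. This is an illustrative sketch, not the job's actual code: the writeRecord helper and the in-memory sink are hypothetical stand-ins for the real file writes. One plausible hazard with the earlier pattern is that CompletableFuture.runAsync, by default, runs the task on the common ForkJoinPool, i.e. on a thread that may not carry the submitting caller's context:

```java
import java.util.concurrent.CompletableFuture;

public class WritePatterns {
    // Hypothetical stand-in for the job's file-writing routine.
    static void writeRecord(StringBuilder sink, String record) {
        sink.append(record).append('\n');
    }

    public static void main(String[] args) {
        StringBuilder sink = new StringBuilder();

        // Before: asynchronous write via CompletableFuture.runAsync.
        // Without an explicit Executor, the task runs on the ForkJoinPool
        // common pool, a thread the caller does not directly control.
        CompletableFuture<Void> f =
                CompletableFuture.runAsync(() -> writeRecord(sink, "async-record"));
        f.join(); // wait for completion so this example is deterministic

        // After: a plain synchronous write on the calling thread.
        writeRecord(sink, "sync-record");

        System.out.print(sink);
    }
}
```

Note that runAsync also has an overload taking an Executor, which is the usual way to keep such tasks on threads you manage yourself.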
Created ‎08-28-2021 01:30 PM
@hadoclc
Can you share more details? What kind of job is it (Spark or Hive)?
Can you share some information about your environment and the code that fails when submitted?
What are the permissions on /user/yarn?
Who ran the job in 7.1.4, and how? Is the same user running the job in 7.1.6?
Could you please share the logs?
Created ‎09-15-2021 08:01 AM
Sorry for the delay... I've been working with Cloudera Support, but there is no real answer yet. Anyway, it's a Spark job, and we have proven it can work on 7.1.6 in two different clusters, so we have two clusters that work and two that don't, all running CDP 7.1.6. Here are the permissions for the /user/yarn directory on a cluster where the job fails:
```
varneyg@hdadmgw01mxm1:~$ hdfs dfs -ls -d /user/yarn
drwxr-xr-x - yarn yarn 0 2021-08-25 14:08 /user/yarn
varneyg@hdadmgw01mxm1:~$ hdfs dfs -ls /user/yarn
Found 3 items
drwx------ - yarn yarn 0 2021-09-15 10:12 /user/yarn/.staging
drwxr-xr-x - yarn yarn 0 2021-09-13 10:32 /user/yarn/google-links
drwxr-xr-x - hdfs yarn 0 2021-02-21 13:15 /user/yarn/mapreduce
```
The job runs as a "googleops" user, and the same code runs on both 7.1.4 and 7.1.6. I will attach the part of the job that does the writes, and the job logs.
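For what it's worth, the listing above is already enough to explain a denial for googleops: /user/yarn/google-links is mode drwxr-xr-x and owned by yarn:yarn, so only the yarn user has write access. A simplified sketch of a POSIX-style write check makes this concrete; this is an illustration in the spirit of HDFS's FSPermissionChecker, not its actual implementation:

```java
import java.util.Set;

public class PermCheck {
    // Simplified write check over an ls-style mode string like "drwxr-xr-x":
    // chars 1-3 are the owner bits, 4-6 the group bits, 7-9 the other bits.
    static boolean canWrite(String mode, String owner, String group,
                            String user, Set<String> userGroups) {
        if (user.equals(owner))         return mode.charAt(2) == 'w';
        if (userGroups.contains(group)) return mode.charAt(5) == 'w';
        return mode.charAt(8) == 'w';
    }

    public static void main(String[] args) {
        // /user/yarn/google-links is drwxr-xr-x yarn:yarn, so googleops
        // (not the owner, and assumed not in the yarn group) cannot write:
        System.out.println(canWrite("drwxr-xr-x", "yarn", "yarn",
                "googleops", Set.of("googleops"))); // prints "false"
        // yarn itself, as the owner, can:
        System.out.println(canWrite("drwxr-xr-x", "yarn", "yarn",
                "yarn", Set.of("yarn"))); // prints "true"
    }
}
```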
Created ‎09-15-2021 08:05 AM
Logs and code attached.
Created ‎09-15-2021 07:17 PM
Hi,
According to the application log, this looks like a permissions issue. Have you tried running the job as other users, or granting this user access to the path?
Check which users do have access to this path and apply the same permissions for the "googleops" user.
```json
{"Event":"SparkListenerTaskEnd","Stage ID":11,"Stage Attempt ID":0,"Task Type":"ResultTask","Task End Reason":{"Reason":"ExceptionFailure","Class Name":"org.apache.hadoop.security.AccessControlException","Description":"Permission denied: user=googleops, access=WRITE, inode=\"/user/yarn/google-links/google-book-feed/_temporary/0\":yarn:yarn:drwxr-xr-x\n\tat org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:504)\n\tat org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:336)\n\tat org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:242)\n\tat
```
Thank you,
Chethan YM
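One way to read the inode path in that exception: _temporary/0 is the staging directory that the Hadoop/Spark file output committer creates under the job's output path, and in HDFS a relative path resolves against the running user's home directory, /user/&lt;user&gt;. The following is a hedged sketch of that resolution logic, with all names illustrative; it shows one way output could end up under /user/yarn even though googleops submitted the job:

```java
public class OutputPathResolution {
    // Sketch of HDFS-style relative path resolution: a path without a
    // leading "/" resolves against /user/<user>; the committer then stages
    // task output under <resolved output>/_temporary/<attempt>.
    static String stagingDir(String user, String outputPath) {
        String base = outputPath.startsWith("/")
                ? outputPath
                : "/user/" + user + "/" + outputPath;
        return base + "/_temporary/0";
    }

    public static void main(String[] args) {
        // If the hypothetical output path "google-links/google-book-feed"
        // were resolved while the effective user was yarn, the staging dir
        // would land exactly where the AccessControlException points:
        System.out.println(stagingDir("yarn", "google-links/google-book-feed"));
        // → /user/yarn/google-links/google-book-feed/_temporary/0
    }
}
```

This would make the denial a downstream symptom: the directory was created under yarn's ownership at some point, and googleops later failed to write into it.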