We are running an Oozie workflow.
All Hive/Impala steps run fine. One step in the workflow is a shell script that launches the Hadoop streaming jar. The exact same code ran fine on an old cluster (CDH 5.12), and everything related to this script also runs fine locally. However, when run through Oozie on our current cluster (CDH 6.2), it fails.
The job logs show the following error:
Error launching job , Invalid job conf : cache file (mapreduce.job.cache.files) scheme: "hdfs" host: "poccluster" port: -1 file: "/thefilepath" conflicts with cache file (mapreduce.job.cache.files) /user/yarn/.staging/job_1558348093683_0051/files/run_it_parallel.sh
Streaming Command Failed!
We have tried clearing the user/app cache in the YARN directories of our users. We have not found a mapreduce.job.cache.files parameter anywhere in our configuration. The /user/yarn/.staging directory is empty.
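To make the first error easier to reason about, here is a minimal sketch of a check for clashing cache entries. It rests on an assumption that is not confirmed for CDH 6.2: that the "conflicts with cache file" message is raised when two different URIs in mapreduce.job.cache.files resolve to the same link name (the "#fragment" if one is given, otherwise the file name). The helper names link_name and find_conflicts, and the HDFS path in the example, are hypothetical.

```python
from collections import defaultdict
from posixpath import basename
from urllib.parse import urlsplit


def link_name(uri: str) -> str:
    """Return the name the cached file would be linked under (assumed rule)."""
    parts = urlsplit(uri)
    # Assumption: a '#fragment' overrides the name, otherwise the last
    # path component is used.
    return parts.fragment or basename(parts.path)


def find_conflicts(cache_files: str) -> dict:
    """Map each link name to the distinct URIs claiming it, clashes only."""
    by_name = defaultdict(set)
    for uri in filter(None, (u.strip() for u in cache_files.split(","))):
        by_name[link_name(uri)].add(uri)
    # Identical URIs listed twice collapse in the set and are not reported.
    return {name: sorted(uris) for name, uris in by_name.items() if len(uris) > 1}


if __name__ == "__main__":
    # Hypothetical value: the real HDFS path is redacted in the error above,
    # so this one is made up purely for illustration.
    value = (
        "hdfs://poccluster/hypothetical/dir/run_it_parallel.sh,"
        "/user/yarn/.staging/job_1558348093683_0051/files/run_it_parallel.sh"
    )
    for name, uris in find_conflicts(value).items():
        print(name, "is claimed by:", ", ".join(uris))
```

If the assumption holds, feeding the effective value of mapreduce.job.cache.files from the failing job's configuration into find_conflicts would list every file name that is shipped from two different locations.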
The container logs show the following error:
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.oozie.action.hadoop.LauncherAM.runActionMain(LauncherAM.java:410)
at org.apache.oozie.action.hadoop.LauncherAM.access$300(LauncherAM.java:55)
at org.apache.oozie.action.hadoop.LauncherAM$2.run(LauncherAM.java:223)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.oozie.action.hadoop.LauncherAM.run(LauncherAM.java:217)
at org.apache.oozie.action.hadoop.LauncherAM$1.run(LauncherAM.java:153)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1875)
at org.apache.oozie.action.hadoop.LauncherAM.main(LauncherAM.java:141)
Caused by: org.apache.oozie.action.hadoop.LauncherMainException
at org.apache.oozie.action.hadoop.ShellMain.run(ShellMain.java:76)
at org.apache.oozie.action.hadoop.LauncherMain.run(LauncherMain.java:104)
at org.apache.oozie.action.hadoop.ShellMain.main(ShellMain.java:63)
... 16 more
This looks like it is related to access rights. We do not use Kerberos, and Sentry is disabled for all services.
We are fairly sure the error is not in our own code: the same process worked on a different cluster, and it runs without issues when started locally on a worker node. One possible difference is that locally we run it as our own user rather than as the users used by Oozie/YARN, which may be related to the error.
Any input, especially on the first error from the job logs, would be hugely appreciated.
Thanks!