So, should he be passing in file:// in his arg? Does it default to a local file?
Without a scheme, it should be treated as a local file, yes. It seems like the file is distributed just fine; there's a potential permissions or visibility issue somewhere though.
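To illustrate the "no scheme means local file" rule, here is a minimal Python sketch. It is not Spark's actual resolution code; `effective_scheme` is a hypothetical helper, and the only assumption is that a path with no scheme falls back to `file`.

```python
from urllib.parse import urlparse


def effective_scheme(path, default="file"):
    """Return the URI scheme of `path`, or `default` when the path has none.

    Hypothetical helper for illustration only; it mimics the idea that a
    bare path is treated as a local file.
    """
    scheme = urlparse(path).scheme
    return scheme if scheme else default


if __name__ == "__main__":
    for p in ["/home/user/script.sh",
              "file:///home/user/script.sh",
              "hdfs://namenode:8020/scripts/script.sh"]:
        print(p, "->", effective_scheme(p))
```

If that rule holds, adding "file://" explicitly should not change anything, since both forms end up with the same scheme.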
I'm curious, what are your imports for the sample code you provided? I can't find "Paths" in the documentation.
To remove R from the equation, I tried it with a simple bash script; same behavior (i.e. works with local, fails on yarn-client with either "No such file or directory" or "Permission denied", depending on the node).
I also tried adding the scheme "file://" to the beginning, again same behavior.
I'm with Scott: I think it looks like a permissions issue, but I was not involved in setting up permissions, so I'm not certain. We'll have Scott look at it offline. Thanks, everyone.
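For anyone trying to tell the two errors apart per node, a small check along these lines might help. This is only a sketch: `diagnose` is a hypothetical helper, not part of Spark, and it assumes you already know the path where the script is expected to land on the node.

```python
import os
import socket


def diagnose(path):
    """Report, for the current host and user, whether `path` can be executed.

    Returns a (hostname, status) tuple. Hypothetical helper for illustration.
    """
    host = socket.gethostname()
    # exists() is also False when a parent directory is not readable by this
    # user, so "missing" here really means "missing or not visible".
    if not os.path.exists(path):
        return (host, "missing or not visible")
    # The file is there, but the current user has no execute permission on it.
    if not os.access(path, os.X_OK):
        return (host, "not executable")
    return (host, "ok")


if __name__ == "__main__":
    print(diagnose("/bin/sh"))
```

Run as the same user the executors run as, "missing or not visible" would roughly correspond to "No such file or directory" and "not executable" to "Permission denied". On a cluster, one option (assuming PySpark) would be to call it from inside `mapPartitions` so that each executor reports its own result.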
Hi, I'm wondering if you all determined the root cause and/or a solution to this. I'm having the same problem myself. Thanks!
No, I never did. Eventually we worked around the problem by just installing the script we wanted to invoke on every node in the cluster and adding it to the path using Ansible (you could probably also achieve this with Puppet or Chef). This allowed us to achieve our goal, without having to rely on Spark distributing the script file. But it is not ideal obviously, since now we have an extra deployment step to update the script on all the nodes whenever it changes. I'm still not too happy, but I've moved on.