02-25-2015 09:27 AM
I have written a Spark application in Python and tested it successfully. I run it with spark-submit from the command line.
Everything seems to work fine and I get the expected output.
The problem is, when I try to schedule my application through crontab, to run every 5 minutes, it fails with the following error:
/u01/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/spark/bin/compute-classpath.sh: line 64: hadoop: command not found
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
at java.security.AccessController.doPrivileged(Native Method)
... 5 more
It looks to me as though crontab does not load the environment variables where I store all the paths to the jars (the hadoop classpath is missing when the script is launched by crontab). Has anyone encountered this issue? I tried some of these solutions: http://unix.stackexchange.com/questions/27289/how-can-i-run-a-cron-command-with-existing-environment...
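One way to confirm this suspicion is to capture the environment once from the terminal and once from cron, then compare the two. The following is only a minimal sketch: the script name env_snapshot.py, the output file names, and the helper names are all hypothetical, not part of the original setup.

```python
import json
import os
import sys


def snapshot_env(path):
    """Write the current process environment to `path` as JSON."""
    with open(path, "w") as fh:
        json.dump(dict(os.environ), fh, indent=2, sort_keys=True)


def diff_env(interactive, cron):
    """Compare two environment dicts.

    Returns (missing, changed): variables present interactively but absent
    under cron, and variables present in both with different values.
    """
    missing = sorted(k for k in interactive if k not in cron)
    changed = sorted(
        k for k in interactive if k in cron and interactive[k] != cron[k]
    )
    return missing, changed


if __name__ == "__main__":
    args = sys.argv[1:]
    if len(args) == 2 and args[0] == "dump":
        # e.g. run "env_snapshot.py dump /tmp/env_shell.json" in the terminal
        # and "env_snapshot.py dump /tmp/env_cron.json" from a crontab entry.
        snapshot_env(args[1])
    elif len(args) == 3 and args[0] == "diff":
        with open(args[1]) as a, open(args[2]) as b:
            missing, changed = diff_env(json.load(a), json.load(b))
        print("missing under cron: %s" % ", ".join(missing))
        print("different under cron: %s" % ", ".join(changed))
    else:
        print("usage: env_snapshot.py dump FILE | diff SHELL_FILE CRON_FILE")
```

If PATH shows up as different and the directory containing the hadoop command is absent from the cron value, that would be consistent with the "hadoop: command not found" line in the error above.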
02-26-2015 12:45 AM
Thank you, Sowen, for the reply, but what I was actually saying is that the hadoop classpath and the other paths are missing only when the script is launched by crontab. I have no problems when I launch the script manually.
02-26-2015 01:28 AM
Is some of the environment setup only happening in your shell config that is triggered for interactive shells?
The problem is fairly clear: the environment is not set up. The question is why, but it's not really a Spark issue per se.
02-26-2015 11:52 PM
to the top of a shell-script that submits your Spark job. Don't have a VM with Spark / Hadoop handy right now, but IIRC that's what I've needed to do in the past.
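The general idea of setting up the environment explicitly before submitting, rather than relying on whatever the login shell provides, can also be sketched as a small Python launcher. This is an illustration under stated assumptions, not the setup from this thread: every path below is a hypothetical placeholder, and the helper names are made up for the example.

```python
import os
import subprocess
import sys

# Hypothetical placeholders: replace with the real locations on your cluster.
SPARK_SUBMIT = "/path/to/spark/bin/spark-submit"
APP = "/path/to/my_app.py"
EXTRA_PATH_DIRS = ["/path/to/hadoop/bin"]  # directory holding the hadoop command


def build_env(extra_path_dirs, base_env=None):
    """Return a copy of the environment with extra directories prepended to PATH."""
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("PATH", "")
    parts = list(extra_path_dirs) + ([current] if current else [])
    env["PATH"] = os.pathsep.join(parts)
    return env


def submit(spark_submit, app, extra_path_dirs):
    """Run spark-submit with the explicit environment and return its exit code."""
    env = build_env(extra_path_dirs)
    return subprocess.call([spark_submit, app], env=env)


if __name__ == "__main__":
    if os.path.exists(SPARK_SUBMIT):
        sys.exit(submit(SPARK_SUBMIT, APP, EXTRA_PATH_DIRS))
    print("spark-submit not found at %s; edit the placeholder paths" % SPARK_SUBMIT)
```

Cron would then call the launcher instead of spark-submit directly, and because the PATH is built inside the launcher, the job no longer depends on shell startup files that cron does not read.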
11-21-2015 05:38 AM
I am trying to schedule a spark job using cron.
I have written a shell script, and it runs fine from the terminal.
However, when cron runs the script, it fails with an "insufficient memory to start JVM thread" error.
There is never an issue when I start the script from the terminal. The error occurs only when the script is started by cron.
Could you kindly suggest something?
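The thread does not give a diagnosis for this one. Since the script behaves differently only under cron, one thing that might be worth comparing (an assumption, not a confirmed cause) is the resource limits the process inherits in each case. A minimal sketch that prints them, to be run once from the terminal and once from a crontab entry:

```python
import resource

# Limits that commonly matter for starting JVM threads; which of these is
# relevant here, if any, is an assumption to verify on the actual machine.
LIMIT_NAMES = [
    "RLIMIT_NPROC",   # max processes/threads for the user
    "RLIMIT_AS",      # max address space
    "RLIMIT_DATA",    # max data segment size
    "RLIMIT_STACK",   # max stack size
    "RLIMIT_NOFILE",  # max open file descriptors
]


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def current_limits():
    """Return {name: (soft, hard)} for each limit available on this platform."""
    limits = {}
    for name in LIMIT_NAMES:
        const = getattr(resource, name, None)
        if const is None:
            continue  # not defined on this platform
        soft, hard = resource.getrlimit(const)
        limits[name] = (fmt(soft), fmt(hard))
    return limits


if __name__ == "__main__":
    for name, (soft, hard) in sorted(current_limits().items()):
        print("%s soft=%s hard=%s" % (name, soft, hard))
```

If the two outputs differ, that narrows down where to look; if they match, the difference is more likely in the environment variables the script sees under cron.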