Member since
02-24-2015
02-27-2015
06:28 AM
Thank you all! I have re-set the env variables in crontab as you suggested. It seems to work fine!
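For anyone who prefers not to put the variables in the crontab itself, an alternative is a small launcher that builds the environment explicitly before calling spark-submit. This is only a sketch under assumptions: the helper names `cron_safe_env` and `run_job` are made up for illustration, and the directory you pass in must be the one that actually contains the `hadoop` command on your machine.

```python
import os
import subprocess


def cron_safe_env(extra_dirs, base_env=None):
    """Return a copy of the environment with extra_dirs prepended to PATH.

    base_env defaults to the current process environment, which under cron
    is usually much smaller than the one in an interactive shell.
    """
    env = dict(os.environ if base_env is None else base_env)
    current = env.get("PATH", "")
    parts = list(extra_dirs) + ([current] if current else [])
    env["PATH"] = os.pathsep.join(parts)
    return env


def run_job(command, extra_dirs):
    """Run command (a list of arguments) with the augmented PATH.

    Returns the exit code of the command.
    """
    return subprocess.call(command, env=cron_safe_env(extra_dirs))
```

The crontab entry would then call a script that does something like `run_job(["spark-submit", "my_app.py"], ["/path/to/hadoop/bin"])`, where both the application name and the directory are placeholders to replace with your own.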
02-26-2015
01:26 AM
In both cases I use the same user (my own user name). To schedule the crontab job I use "crontab -e" under my user.
02-26-2015
12:45 AM
Thank you Sowen for the reply, but what I meant is that the Hadoop classpath and the other environment settings are missing only when the script is launched by crontab. I have no problems when I launch the script manually.
02-25-2015
09:27 AM
I have written a Spark application in Python and tested it successfully. I run it with spark-submit from the command line. Everything seems to work fine and I get the expected output. The problem is that when I schedule the application through crontab to run every 5 minutes, it fails with the following error:

/u01/cloudera/parcels/CDH-5.1.3-1.cdh5.1.3.p0.12/lib/spark/bin/compute-classpath.sh: line 64: hadoop: command not found
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/fs/FSDataInputStream
    at org.apache.spark.deploy.SparkSubmitArguments.parse$1(SparkSubmitArguments.scala:313)
    at org.apache.spark.deploy.SparkSubmitArguments.parseOpts(SparkSubmitArguments.scala:207)
    at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:59)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:50)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.FSDataInputStream
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    ... 5 more

It looks to me as if crontab does not load the environment variables where I store the paths to the jars (the Hadoop classpath is missing when the script is launched by crontab). Has anyone encountered this issue? I tried some of the solutions from: http://unix.stackexchange.com/questions/27289/how-can-i-run-a-cron-command-with-existing-environmental-variables
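The "hadoop: command not found" line suggests the command is simply not on the PATH that cron gives the job. A quick way to check this is to resolve the command names against a cron-like PATH. The sketch below is only illustrative: the helper name `missing_commands` is made up, the value `/usr/bin:/bin` is an assumption about a typical cron default (check your own system), and `shutil.which` needs Python 3.3 or later.

```python
import shutil

# Assumption: a typical minimal PATH for cron jobs; your system may differ.
CRON_LIKE_PATH = "/usr/bin:/bin"


def missing_commands(commands, path=CRON_LIKE_PATH):
    """Return the commands that cannot be found on the given PATH."""
    return [cmd for cmd in commands if shutil.which(cmd, path=path) is None]
```

Calling `missing_commands(["hadoop", "spark-submit"])` from an interactive shell lists the commands that would not be found under that restricted PATH, even though they resolve fine with the login environment.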
Labels: Apache Spark