I've just installed Spark (using parcels) on CDH 4.5. I've managed to run spark-shell and completed the basic tutorial exercises.
Now I'm trying to run some of the examples, but I'm running into difficulties.
I found a spark-examples jar, but I couldn't find a run-example script, so I went to the GitHub repository, downloaded the script, and placed it in "/opt/cloudera/parcels/SPARK/lib/spark/bin".
From this folder, /opt/cloudera/parcels/SPARK/lib/spark,
I ran this command:
> bin/run-example -cp examples/lib/spark-examples_2.10-0.9.0-cdh4.6.0-SNAPSHOT.jar org.apache.spark.streaming.examples.NetworkWordCount local localhost 9999
And I get this error:
/opt/cloudera/parcels/SPARK/lib/spark/conf/spark-env.sh: line 30: cmaster.hdc: command not found
Failed to find Spark examples assembly in /opt/cloudera/parcels/SPARK/lib/spark/examples/target
You need to build Spark with sbt/sbt assembly before running this program
Can anyone provide some hints on where I went wrong, and advise me on how to run the examples?
Many thanks. BTW, I'm a newbie to Scala.
Other folks here may know better than I, but:
Try setting the environment variable SPARK_EXAMPLES_JAR to the location of the examples jar. By default, the scripts assume you're running from the Spark project directory.
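As a rough sketch (the jar path below is the one from the question; adjust it to wherever the jar actually lives on your install):

```shell
# Point SPARK_EXAMPLES_JAR at the examples jar shipped with the parcel
# (path taken from the question above; verify it on your system).
export SPARK_EXAMPLES_JAR=/opt/cloudera/parcels/SPARK/lib/spark/examples/lib/spark-examples_2.10-0.9.0-cdh4.6.0-SNAPSHOT.jar

# Then retry from the Spark directory:
cd /opt/cloudera/parcels/SPARK/lib/spark
bin/run-example org.apache.spark.streaming.examples.NetworkWordCount local localhost 9999
```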
You can also check out the whole Spark 0.9.0 source from GitHub and build it with sbt, as the error message says. I imagine that would work as well.
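Something along these lines should do it (the tag name and the CDH Hadoop version string are assumptions; pick the ones matching your cluster):

```shell
# Check out the Spark source and build the assembly, including examples.
git clone https://github.com/apache/spark.git
cd spark
git checkout v0.9.0-incubating   # assumed tag for the 0.9.0 release

# Build against the CDH-flavored Hadoop version (assumed here; see
# "A Note About Hadoop Versions" in the Spark docs for the exact string).
SPARK_HADOOP_VERSION=2.0.0-mr1-cdh4.5.0 sbt/sbt assembly
```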
But first, perhaps just try typing code in the Spark / Scala shell that invokes the example classes you see in the project. They should all already be on the classpath that gets set up. For example, type `import org.apache.spark.examples._` first and then start using the classes.
Spark also needs sbt.
I'm not sure if you have run the sbt setup. If not, please do the following:
Go to the section "A Note About Hadoop Versions" in the Spark documentation. Make sure you use the right Hadoop version, since CDH appends a suffix to the Hadoop version string,
something like "2.0.0-mr1-cdh4.1.1".
Then go to your Spark home directory and run SPARK_HADOOP_VERSION=2.0.0-mr1-cdh4.1.1 sbt/sbt assembly
For example:
[abhi@localhost spark]$ SPARK_HADOOP_VERSION=2.2.0 sbt/sbt assembly
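Once the build finishes, you can sanity-check that the assembly exists. Per the error message in the question, run-example looks under examples/target (the exact subdirectory and jar name below are assumptions):

```shell
# From your Spark home directory: confirm the examples assembly was built.
# The Scala subdirectory name depends on your build (scala-2.10 assumed).
ls examples/target/scala-2.10/
```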
Make sure you have open internet access, since sbt needs to download dependencies.
Once you're done with that, it's time to run the example.
Please note that there's NO run-example command inside the Spark/bin directory in this version.
Instead, go to your Spark home directory; there you will find commands like run-example.
1. Open a terminal and start netcat: [abhi@localhost home]$ nc -lk 9999
2. Open another terminal, go to your Spark home directory, and run:
./run-example org.apache.spark.streaming.examples.NetworkWordCount local localhost 9999
Hopefully it will resolve your issue.