Support Questions


How to run the SparkWordCount.scala file in HDP 2.3? Please help.

Contributor
1 ACCEPTED SOLUTION

Super Guru

Try executing the command below; once you get the Spark shell prompt (with the `sc` context available), run SparkWordCount.scala in it.

bash# spark-shell --master yarn-client


9 REPLIES


Contributor

Do I need to compile it, or can I spark-submit the Scala file?

Super Guru

@AKILA VEL

You can run the .scala file directly in spark-shell:

:load PATH_TO_FILE

Super Guru

@AKILA VEL

Here is a sample program.

import org.apache.spark._
import org.apache.spark.SparkContext._
object WordCount {
    def main(args: Array[String]) {  
      val conf = new SparkConf().setAppName("wordCount")
      // Create a Scala Spark Context.
      val sc = new SparkContext(conf)
      // Load our input data.
      val input =  sc.textFile("/user/test/input/data.txt")
      // Split up into words.
      val words = input.flatMap(line => line.split(" "))
      // Transform into word and count.
      val counts = words.map(word => (word, 1)).reduceByKey{case (x, y) => x + y}
      // Save the word count back out to a text file, causing evaluation.
      counts.saveAsTextFile("/user/test/output")
    }
}

spark-shell --master yarn-client
scala> :load <file path>
scala> WordCount.main(null)
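One caveat: spark-shell already provides a SparkContext as `sc`, so on some Spark versions :load-ing a file that constructs its own SparkContext can fail with a "only one SparkContext may be running" error; reusing the shell's `sc` avoids that. The counting logic itself can also be sanity-checked without a cluster, since plain Scala collections support the same flatMap/map shape (the input lines below are made up):

```scala
object WordCountLocal {
  def main(args: Array[String]): Unit = {
    // In-memory stand-in for sc.textFile: two sample lines of text.
    val input = List("hello world", "hello spark")
    // Same pipeline shape as the RDD version: split, pair each word with 1, sum per word.
    val words = input.flatMap(line => line.split(" "))
    val counts = words
      .map(word => (word, 1))
      .groupBy { case (word, _) => word }
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }
    println(counts.toList.sortBy(_._1)) // List((hello,2), (spark,1), (world,1))
  }
}
```

The only differences from the Spark version are `groupBy` + `sum` standing in for `reduceByKey`, and `println` standing in for `saveAsTextFile`.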

Contributor

Thanks a lot. It worked.

Contributor

Can you please explain how to make a jar file out of it?

Super Guru

@AKILA VEL

Thanks for confirming. Can you please click the accept button on my answer?

Regarding building the jar, I have provided an answer to your other question here:

https://community.hortonworks.com/questions/28962/how-to-create-jar-file-from-spark-scala-file.html#...


You need to compile it; spark-submit expects a jar file. See http://spark.apache.org/docs/latest/quick-start.html#self-contained-applications
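As a rough sketch of that compile-and-submit flow (the project layout and jar name are hypothetical, assuming an sbt project targeting the Scala 2.10 / Spark 1.x stack shipped with HDP 2.3):

```shell
# Compile and package the application into a jar (run from the project root)
sbt package

# Submit the packaged jar to YARN; --class names the object containing main
spark-submit --class WordCount --master yarn-client \
  target/scala-2.10/wordcount_2.10-1.0.jar
```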


Hi @AKILA VEL,

Please check this tutorial on how to do a word count with Spark on HDP 2.3: http://fr.hortonworks.com/hadoop-tutorial/a-lap-around-apache-spark/

Section 1 shows how to upgrade Spark to version 1.6; you can skip it and go directly to section 2.

I hope this will help you.