
How to run the SparkWordCount.scala file in hdp 2.3? Please help.

Contributor
1 ACCEPTED SOLUTION

Super Guru

Try executing the command below. Once you get the Spark shell prompt (with the SparkContext `sc` available), run SparkWordCount.scala in it.

bash# spark-shell --master yarn-client


9 REPLIES


Contributor

Do I need to compile it, or spark-submit the Scala file?

Super Guru

@AKILA VEL

You can run the .scala file directly in spark-shell:

:load PATH_TO_FILE

Super Guru

@AKILA VEL

Here is a sample program.

import org.apache.spark._
import org.apache.spark.SparkContext._
object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("wordCount")
    // Create a Scala Spark context.
    val sc = new SparkContext(conf)
    // Load our input data.
    val input = sc.textFile("/user/test/input/data.txt")
    // Split each line up into words.
    val words = input.flatMap(line => line.split(" "))
    // Transform into (word, count) pairs.
    val counts = words.map(word => (word, 1)).reduceByKey { case (x, y) => x + y }
    // Save the word counts back out to a text file, triggering evaluation.
    counts.saveAsTextFile("/user/test/output")
  }
}

bash# spark-shell --master yarn-client
scala> :load <file path>
scala> WordCount.main(Array.empty)
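If you just want to sanity-check the word-count transformations before running on the cluster, the same flatMap/count logic can be sketched on plain Scala collections, with no SparkContext needed. The object and method names below are illustrative, not from the original post:

```scala
// Local sketch of the word-count logic on ordinary Scala collections.
// groupBy + size plays the role that map + reduceByKey plays on an RDD.
object LocalWordCount {
  def count(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))                 // split each line into words
      .groupBy(identity)                     // group equal words together
      .map { case (w, ws) => (w, ws.size) }  // count each group

  def main(args: Array[String]): Unit =
    println(count(Seq("a b a", "b a")))
}
```

Running `count(Seq("a b a", "b a"))` yields a map with `"a" -> 3` and `"b" -> 2`, which matches what the Spark job writes out per word.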

Contributor

Thanks a lot. It worked.

Contributor

Can you please explain how to make a jar file out of it?

Super Guru

@AKILA VEL

Thanks for confirming. Can you please click the accept button on my answer?

Regarding building the jar, I have answered that on your other question here:

https://community.hortonworks.com/questions/28962/how-to-create-jar-file-from-spark-scala-file.html#...
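In outline, packaging with sbt comes down to a small build file plus `sbt package`. The versions and file names below are illustrative assumptions for an HDP 2.3-era Spark 1.x setup, not taken from the linked answer:

```scala
// build.sbt -- minimal sketch; Spark and Scala versions are assumptions
// roughly matching the Spark 1.x that shipped with HDP 2.3.
name := "wordcount"
version := "0.1"
scalaVersion := "2.10.5"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1" % "provided"
```

Then `sbt package` produces the jar (under `target/scala-2.10/`), which you can submit with something like `spark-submit --class WordCount --master yarn-client target/scala-2.10/wordcount_2.10-0.1.jar`. Marking spark-core as "provided" keeps it out of the jar, since the cluster supplies Spark at runtime.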


You need to compile it; spark-submit wants a jar file. See http://spark.apache.org/docs/latest/quick-start.html#self-contained-applications


Hi @AKILA VEL,

Please check this tutorial on how you can do a wordcount with Spark on HDP 2.3: http://fr.hortonworks.com/hadoop-tutorial/a-lap-around-apache-spark/

Section 1 shows how to upgrade Spark to version 1.6; you can skip it and go straight to section 2.

I hope this will help you.