
How to run the SparkWordCount.scala file in hdp 2.3? Please help.


How to run the SparkWordCount.scala file in hdp 2.3? Please help.

1 ACCEPTED SOLUTION


Re: How to run the SparkWordCount.scala file in hdp 2.3? Please help.

Run the command below. Once you get the spark-shell prompt (with the Spark context available as sc), run SparkWordCount.scala in it.

bash# spark-shell --master yarn-client

Re: How to run the SparkWordCount.scala file in hdp 2.3? Please help.

Do I need to compile it, or can I spark-submit the Scala file directly?

Re: How to run the SparkWordCount.scala file in hdp 2.3? Please help.

@AKILA VEL

You can run the .scala file directly in spark-shell:

:load PATH_TO_FILE

Re: How to run the SparkWordCount.scala file in hdp 2.3? Please help.

@AKILA VEL

Here is a sample program.

import org.apache.spark._
import org.apache.spark.SparkContext._

object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("wordCount")
    // Create a Scala Spark Context.
    val sc = new SparkContext(conf)
    // Load our input data.
    val input = sc.textFile("/user/test/input/data.txt")
    // Split each line into words.
    val words = input.flatMap(line => line.split(" "))
    // Pair each word with 1, then sum the counts per word.
    val counts = words.map(word => (word, 1)).reduceByKey(_ + _)
    // Save the word counts back out to a text file, causing evaluation.
    counts.saveAsTextFile("/user/test/output")
  }
}

spark-shell --master yarn-client
scala> :load <file path>
scala> WordCount.main(Array.empty[String])
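
To check what the job computes before running it on the cluster, the same split, pair, and sum steps can be tried on an ordinary Scala collection. This is only a minimal sketch: LocalWordCount and countWords are hypothetical names that do not appear in the sample program, and groupBy stands in for Spark's reduceByKey.

```scala
// Hypothetical helper, not part of the original program: it mirrors the
// Spark pipeline on an in-memory collection so it runs without a cluster.
object LocalWordCount {
  def countWords(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(line => line.split(" "))    // split each line into words
      .map(word => (word, 1))              // pair each word with a count of 1
      .groupBy { case (word, _) => word }  // stand-in for reduceByKey
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }
}
```

For example, LocalWordCount.countWords(Seq("to be or", "not to be")) counts "to" and "be" twice each and "or" and "not" once each. It can be pasted at the same scala> prompt, since it does not touch the Spark context.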

Re: How to run the SparkWordCount.scala file in hdp 2.3? Please help.

Thanks a lot. It worked.

Re: How to run the SparkWordCount.scala file in hdp 2.3? Please help.

Can you please explain how to build a jar file from it?

Re: How to run the SparkWordCount.scala file in hdp 2.3? Please help.

@AKILA VEL

Thanks for confirming. Could you please click the accept button on my answer?

As for building the jar, I have answered your other question here:

https://community.hortonworks.com/questions/28962/how-to-create-jar-file-from-spark-scala-file.html#...

Re: How to run the SparkWordCount.scala file in hdp 2.3? Please help.

You need to compile it. spark-submit expects a jar file; see http://spark.apache.org/docs/latest/quick-start.html#self-contained-applications
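
As a rough illustration of what compiling it involves, one common route is an sbt build. The build definition below is a sketch under stated assumptions, not something taken from this thread: the project name and every version number are placeholders that need to match the Spark and Scala versions on your cluster.

```scala
// build.sbt: a hypothetical build definition, not taken from this thread.
// The project name and all version numbers are assumptions; match them to
// the Spark and Scala versions installed on your cluster.
name := "wordcount"

version := "0.1"

scalaVersion := "2.10.5"

// "provided" keeps Spark out of the jar, since the cluster supplies it at run time.
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.4.1" % "provided"
```

With the source file placed under src/main/scala, running sbt package should produce a jar under target/scala-2.10/, which can then be passed to spark-submit together with --class WordCount and --master yarn-client.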

Re: How to run the SparkWordCount.scala file in hdp 2.3? Please help.

Hi @AKILA VEL,

Please check this tutorial on how you can do a wordcount with Spark on HDP 2.3: http://fr.hortonworks.com/hadoop-tutorial/a-lap-around-apache-spark/

Section 1 shows how to upgrade Spark to version 1.6. You can skip it and go directly to section 2.

I hope this helps.