How do I run the SparkWordCount.scala file in HDP 2.3? Please help.
Labels: Apache Spark
Created 04-22-2016 02:10 PM
Created 04-22-2016 02:15 PM
Execute the command below; once you get the Spark shell prompt (with the SparkContext available as sc), run sparkwordcount.scala in it.
bash# spark-shell --master yarn-client
Created 04-22-2016 02:28 PM
Do I need to compile it, or can I spark-submit the Scala file directly?
Created 04-22-2016 02:35 PM
Created 04-22-2016 02:56 PM
Here is a sample program:

import org.apache.spark._
import org.apache.spark.SparkContext._

object WordCount {
  def main(args: Array[String]) {
    val conf = new SparkConf().setAppName("wordCount")
    // Create a Scala Spark Context.
    val sc = new SparkContext(conf)
    // Load our input data.
    val input = sc.textFile("/user/test/input/data.txt")
    // Split up into words.
    val words = input.flatMap(line => line.split(" "))
    // Transform into word and count.
    val counts = words.map(word => (word, 1)).reduceByKey { case (x, y) => x + y }
    // Save the word count back out to a text file, causing evaluation.
    counts.saveAsTextFile("/user/test/output")
  }
}

Then load and run it from the shell:

spark-shell --master yarn-client
scala> :load <file path>
scala> WordCount.main(null)
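As a side note (not part of the original answer): the same flatMap/map/reduce word-count logic can be tried on a plain Scala collection first, without a cluster or SparkContext. This is just a sketch of the transformation steps; the object and method names here are made up for illustration.

```scala
object LocalWordCount {
  // Count words in a sequence of lines, mirroring the Spark
  // flatMap -> map -> reduceByKey pipeline on local collections.
  def wordCount(lines: Seq[String]): Map[String, Int] = {
    // Split each line into words (flatMap step).
    val words = lines.flatMap(_.split(" "))
    // Pair each word with 1 (map step), then sum per key
    // (local equivalent of reduceByKey).
    words.map(word => (word, 1))
         .groupBy(_._1)
         .map { case (w, pairs) => (w, pairs.map(_._2).sum) }
  }

  def main(args: Array[String]): Unit = {
    println(wordCount(Seq("to be or not to be")))
  }
}
```

Once this behaves as expected locally, swapping the Seq for an RDD gives the Spark version above.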
Created 04-22-2016 04:53 PM
Thanks a lot. It worked.
Created 04-22-2016 04:55 PM
Can you please explain how to make a JAR file out of it?
Created 04-22-2016 07:03 PM
Thanks for confirming. Can you please click the accept button on my answer?
Regarding building the JAR, I have provided my answer on your other question here
Created 04-22-2016 03:02 PM
You need to compile it; spark-submit expects a JAR file. See http://spark.apache.org/docs/latest/quick-start.html#self-contained-applications
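Not from the thread, but as a rough sketch: a minimal sbt build for packaging the WordCount object above into a JAR might look like the following. The project name and version numbers are assumptions; match the Spark and Scala versions to what ships with your HDP cluster.

```
// build.sbt -- minimal sketch for a Spark application build
// (versions below are assumptions; adjust to your cluster)
name := "wordcount"
version := "1.0"
scalaVersion := "2.10.5"
libraryDependencies += "org.apache.spark" %% "spark-core" % "1.3.1" % "provided"
```

After running sbt package, the resulting JAR under target/ can be submitted with spark-submit --class WordCount --master yarn-client, per the quick-start link above.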
Created 04-22-2016 05:05 PM
Hi @AKILA VEL,
Please check this tutorial on how you can do a wordcount with Spark on HDP 2.3: http://fr.hortonworks.com/hadoop-tutorial/a-lap-around-apache-spark/
Section 1 shows how to upgrade Spark to version 1.6; you can skip it and go directly to section 2.
I hope this helps you.