Objective

Using the correct HDP repositories is a requirement when building Spark production applications that run on HDP. I created this article to help those who are creating new Spark applications in IntelliJ and don't know how to reference the Hortonworks repositories instead of the default ones.

How-To

The following video goes step by step through creating a simple Spark application using the Hortonworks repositories. The contents of build.sbt and the Hello Scala class are shared below.

The build.sbt

name := "sparkTest"

version := "0.1"

scalaVersion := "2.11.12"

libraryDependencies += "org.apache.spark" %% "spark-core" % "2.1.1.2.6.1.0-129"

resolvers := List(
  "Hortonworks Releases" at "http://repo.hortonworks.com/content/repositories/releases/",
  "Jetty Releases" at "http://repo.hortonworks.com/content/repositories/jetty-hadoop/"
)


Important Note: We will use the HDP 2.6.1 Spark 2.1.1 dependencies to build the project. If you are running a different HDP version, you need to check and correct the dependency to match the version actually in use. You should also check which Scala version is correct for your project: for Spark 2.1.1 it is 2.11.x.
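Because the HDP build number is embedded in the dependency version string, it can help to factor it into a single value so that only one line of build.sbt changes when you target a different HDP release. A minimal sketch (the `hdpSparkVersion` name is just an illustration; substitute the version string that matches your cluster):

```scala
// Sketch only: keep the HDP-specific Spark version in one place.
// "2.1.1.2.6.1.0-129" is Spark 2.1.1 as shipped with HDP 2.6.1; replace it
// with the version string matching your own cluster.
val hdpSparkVersion = "2.1.1.2.6.1.0-129"

libraryDependencies += "org.apache.spark" %% "spark-core" % hdpSparkVersion
```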


The Hello Scala class

  1. Create a new package called example
  2. Create a new Scala class called Hello inside the example package with the following content:

package example
import org.apache.spark.{SparkConf, SparkContext}

object Hello extends Greeting with App {
  // Configure and start a SparkContext, taking the app name from the Greeting trait
  val conf = new SparkConf().setAppName(appName)
  val sc = new SparkContext(conf)

  println(greeting)
  println(sc.version) // prints the Spark version resolved from the HDP repositories

}
trait Greeting {
  lazy val appName = "Hello World Spark App"
  lazy val greeting: String = "hello"
}
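Once the class compiles, the application can be packaged and submitted to the cluster. A hedged sketch, assuming sbt's default artifact naming (the jar path below follows from `name := "sparkTest"`, `version := "0.1"`, and Scala 2.11; adjust it if your build settings differ):

```shell
# Package the application with sbt (run from the project root)
sbt package

# Submit to YARN on the HDP cluster; the jar name assumes sbt's
# default normalized artifact naming for this build.sbt
spark-submit \
  --class example.Hello \
  --master yarn \
  target/scala-2.11/sparktest_2.11-0.1.jar
```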