Cloudera Employee

I have been playing quite a bit with CDSW (Cloudera Data Science Workbench) lately. Here is a quick article on how to set up a CDSW project in Scala that connects to an external RDBMS.

Step 1: Create a new CDSW Project

Using the CDSW UI, create a new Scala Project:


Step 2: Reference the external Jar in your spark-defaults.conf

Open your project, and edit your spark-defaults.conf to add an external jar:
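
As a rough illustration, the entry could look like the fragment below. The jar file name and its location are assumptions: this sketch supposes a MySQL Connector/J jar was uploaded to the project root, which CDSW mounts at /home/cdsw. `spark.jars` is the standard Spark property that takes a comma-separated list of jars to put on the driver and executor classpaths.

```
# Hypothetical jar name and path; point this at the driver jar you uploaded
spark.jars /home/cdsw/mysql-connector-java.jar
```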



Step 3: Create a simple Scala file to connect to the DB

Create a new file and add this code in it:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)

val df = sqlContext.read.format("jdbc").option("url", "jdbc:mysql://[YOUR_SERVER_IP]:3306/[YOUR_DB]").option("driver", "com.mysql.jdbc.Driver").option("dbtable", "[YOUR_TABLE]").option("user", "[YOUR_USER]").option("password", "[YOUR_PWD]").load()
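
If you would rather not write the password into a project file, one option is to build the same options from environment variables. The sketch below is only an illustration: the `JdbcOptions` object and the variable names (`DB_HOST`, `DB_PORT`, `DB_NAME`, `DB_TABLE`, `DB_USER`, `DB_PASSWORD`) are hypothetical and not part of CDSW or Spark, and it assumes you have defined those variables for your project's engine.

```scala
// Hypothetical helper: builds the JDBC options used in Step 3 from
// environment variables instead of hard-coded values.
object JdbcOptions {
  // `env` defaults to the process environment; a plain Map can be passed in for testing.
  def fromEnv(env: Map[String, String] = sys.env): Map[String, String] = {
    // Fail early with a clear message when a required variable is missing.
    def required(key: String): String =
      env.getOrElse(key, throw new IllegalArgumentException(s"Missing environment variable: $key"))

    val host = required("DB_HOST")
    val port = env.getOrElse("DB_PORT", "3306") // default MySQL port
    val db   = required("DB_NAME")

    Map(
      "url"      -> s"jdbc:mysql://$host:$port/$db",
      "driver"   -> "com.mysql.jdbc.Driver",
      "dbtable"  -> required("DB_TABLE"),
      "user"     -> required("DB_USER"),
      "password" -> required("DB_PASSWORD")
    )
  }
}
```

With this in place, the read could become `val df = sqlContext.read.format("jdbc").options(JdbcOptions.fromEnv()).load()`, since `DataFrameReader.options` accepts a map of option names to values.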

Step 4: Run your application

Launch a session and run your code.
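
Once the session is running, a few standard DataFrame calls can confirm that the connection works. This is a minimal sketch that assumes `df` is the DataFrame loaded in Step 3; it is meant to be run in the same session rather than on its own.

```scala
// Assumes `df` was created in Step 3 within this session.
df.printSchema()    // column names and types as reported through the JDBC driver
df.show(5)          // print the first five rows
println(df.count()) // row count; this triggers a query against the database
```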