

Prerequisites:

* SAP HANA - instructions to set up a cloud HANA instance on AWS or Azure
* HDP 2.5.x

We will use the Spark shell, Scala, and DataFrames to access HANA through its JDBC driver.

Start the Spark shell with the HANA JDBC driver (ngdbc.jar) on the classpath.

spark-shell --master yarn-client --jars /tmp/ngdbc.jar
scala> val url="jdbc:sap://xxxx:30015/?currentschema=CODEJAMMER"
url: String = jdbc:sap://xxxx:30015/?currentschema=CODEJAMMER
scala> val prop = new java.util.Properties
prop: java.util.Properties = {}
scala> prop.setProperty("user","xxxx")
res1: Object = null
scala> prop.setProperty("password","xxxx")
res2: Object = null
scala> prop.setProperty("driver","com.sap.db.jdbc.Driver")
res3: Object = null
scala> val emp_address = sqlContext.read.jdbc(url, "EMPLOYEE_ADDRESS", prop)
emp_address: org.apache.spark.sql.DataFrame = [ID: bigint, STREETNUMBER: int, STREET: string, LOCALITY: string, STATE: string, COUNTRY: string]
scala> emp_address.show()
17/02/02 20:17:19 INFO SparkContext: Starting job: show at <console>:32
17/02/02 20:17:23 INFO DAGScheduler: Job 0 finished: show at <console>:32, took 4.586219 s
+---+------------+---------------+--------+-----+-------+
| ID|STREETNUMBER|         STREET|LOCALITY|STATE|COUNTRY|
+---+------------+---------------+--------+-----+-------+
|  1|         555|    Madison Ave|New York|   NY|America|
|  2|          95|  Morten Street|New York|   NY|    USA|
|  3|        2395|Broadway Street|New York|   NY|    USA|
+---+------------+---------------+--------+-----+-------+
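The connection string used above follows HANA's JDBC URL pattern, `jdbc:sap://<host>:<port>/?currentschema=<SCHEMA>`. As a minimal sketch, a small helper can assemble it from its parts (the helper name `hanaUrl` is my own, not from this article; the host `xxxx` is the placeholder used above):

```scala
// Build a HANA JDBC URL in the same format used in the transcript above:
//   jdbc:sap://<host>:<port>/?currentschema=<SCHEMA>
def hanaUrl(host: String, port: Int, schema: String): String =
  s"jdbc:sap://$host:$port/?currentschema=$schema"

// Example: reproduces the URL from the transcript.
val url = hanaUrl("xxxx", 30015, "CODEJAMMER")
// url == "jdbc:sap://xxxx:30015/?currentschema=CODEJAMMER"
```

Port 30015 is the SQL port for HANA instance number 00; adjust it to match your instance.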


If you see this error:

org.apache.spark.SparkException: Job aborted due to stage failure: Task not serializable: 

This issue is resolved in the SPS12 (and later) HANA JDBC driver; upgrading my SAP driver fixed it.

Last update: 02-02-2017 08:31 PM