
Exception when passing JDBC connection properties to the Spark object


I have recently finished studying Scala and Spark, and as an exercise I am trying to read data from a table on a Postgres database over a JDBC connection. I created a Scala SBT project with a properties file that stores all the connection properties.

I have the following properties in the connections.properties file:

devHost=xx.xxx.xxx.xxx
devPort=xxxx
devDbName=base
devUserName=username
devPassword=password
gpDriverClass=org.postgresql.Driver

I created a DBManager class where I initialize the connection properties:

import java.io.FileInputStream
import java.util.Properties

class DBManager {
  val dbProps = new Properties()
  val connectionProperties = new Properties()
  dbProps.load(new FileInputStream(connections.properties))

  val jdbcDevHostname = dbProps.getProperty("devHost")
  val jdbcDevPort     = dbProps.getProperty("devPort")
  val jdbcDevDatabase = dbProps.getProperty("devDbName")
  val jdbcDevUrl      = s"jdbc:postgresql://${jdbcDevHostname}:${jdbcDevPort}/${jdbcDevDatabase}?ssl=true&sslfactory=org.postgresql.ssl.NonValidatingFactory" + s",${uname},${pwd}"

  connectionProperties.setProperty("Driver",dbProps.getProperty("gpDriverClass"))
  connectionProperties.put("user", dbProps.getProperty("devUserName"))
  connectionProperties.put("password", dbProps.getProperty("devPassword"))
}
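
As written, the bare connections.properties is compiled as a field access on a value named connections, and uname/pwd are never defined anywhere, which is exactly what the compile errors below complain about. For contrast, here is a minimal sketch of a version of this class that compiles, assuming the properties file sits in the working directory and the credentials are meant to travel in connectionProperties rather than in the URL:

package com.gphive.connections

import java.io.FileInputStream
import java.util.Properties

class DBManager {
  val dbProps = new Properties()
  val connectionProperties = new Properties()

  // The file name must be a String literal; a bare connections.properties
  // is parsed as a reference to a value called "connections".
  dbProps.load(new FileInputStream("connections.properties"))

  val jdbcDevHostname = dbProps.getProperty("devHost")
  val jdbcDevPort     = dbProps.getProperty("devPort")
  val jdbcDevDatabase = dbProps.getProperty("devDbName")

  // The credentials are carried in connectionProperties, so the URL only
  // needs host, port, database and the SSL options.
  val jdbcDevUrl =
    s"jdbc:postgresql://$jdbcDevHostname:$jdbcDevPort/$jdbcDevDatabase" +
      "?ssl=true&sslfactory=org.postgresql.ssl.NonValidatingFactory"

  // Spark's JDBC reader looks for the lowercase "driver" key.
  connectionProperties.setProperty("driver", dbProps.getProperty("gpDriverClass"))
  connectionProperties.put("user", dbProps.getProperty("devUserName"))
  connectionProperties.put("password", dbProps.getProperty("devPassword"))
}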

In a Scala object, I am trying to use all these details as below:

import org.apache.spark.sql.SparkSession
import com.gphive.connections.DBManager

object PartitionRetrieval {
  def main(args: Array[String]): Unit = {
    val dBManager = new DBManager();
    val spark = SparkSession.builder().enableHiveSupport().appName("GP_YEARLY_DATA").getOrCreate()
    val tabData = spark.read.jdbc(dBManager.jdbcDevUrl,"tableName",connectionProperties)

  }
}
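
Note that connectionProperties is defined only inside DBManager, so the object has to reach it through the instance. A sketch of how the call could look ("tableName" is still a placeholder):

import org.apache.spark.sql.SparkSession
import com.gphive.connections.DBManager

object PartitionRetrieval {
  def main(args: Array[String]): Unit = {
    val dBManager = new DBManager()
    val spark = SparkSession.builder()
      .enableHiveSupport()
      .appName("GP_YEARLY_DATA")
      .getOrCreate()

    // connectionProperties is a member of DBManager, not a local value.
    val tabData = spark.read.jdbc(dBManager.jdbcDevUrl, "tableName", dBManager.connectionProperties)
    tabData.show()
  }
}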

I referred to this link to create the above code. When I execute it, I get version-conflict warnings and then compile errors around loading the properties file:
[warn] Found version conflict(s) in library dependencies; some are suspected to be binary incompatible:
[warn]     * io.netty:netty:3.9.9.Final is selected over {3.6.2.Final, 3.7.0.Final}
[warn]         +- org.apache.spark:spark-core_2.11:2.2.0             (depends on 3.9.9.Final)
[warn]         +- org.apache.zookeeper:zookeeper:3.4.6               (depends on 3.6.2.Final)
[warn]         +- org.apache.hadoop:hadoop-hdfs:2.6.5                (depends on 3.6.2.Final)
[warn]     * commons-net:commons-net:2.2 is selected over 3.1
[warn]         +- org.apache.spark:spark-core_2.11:2.2.0             (depends on 2.2)
[warn]         +- org.apache.hadoop:hadoop-common:2.6.5              (depends on 3.1)
[warn]     * com.google.guava:guava:11.0.2 is selected over {12.0.1, 16.0.1}
[warn]         +- org.apache.hadoop:hadoop-yarn-client:2.6.5         (depends on 11.0.2)
[warn]         +- org.apache.hadoop:hadoop-yarn-api:2.6.5            (depends on 11.0.2)
[warn]         +- org.apache.hadoop:hadoop-yarn-common:2.6.5         (depends on 11.0.2)
[warn]         +- org.apache.hadoop:hadoop-yarn-server-nodemanager:2.6.5 (depends on 11.0.2)
[warn]         +- org.apache.hadoop:hadoop-yarn-server-common:2.6.5  (depends on 11.0.2)
[warn]         +- org.apache.hadoop:hadoop-hdfs:2.6.5                (depends on 11.0.2)
[warn]         +- org.apache.curator:curator-framework:2.6.0         (depends on 16.0.1)
[warn]         +- org.apache.curator:curator-client:2.6.0            (depends on 16.0.1)
[warn]         +- org.apache.curator:curator-recipes:2.6.0           (depends on 16.0.1)
[warn]         +- org.apache.hadoop:hadoop-common:2.6.5              (depends on 16.0.1)
[warn]         +- org.htrace:htrace-core:3.0.4                       (depends on 12.0.1)
[warn] Run 'evicted' to see detailed eviction warnings
[info] Compiling 2 Scala sources to C:\YearPartition\target\scala-2.11\classes ...
[error] C:\YearPartition\src\main\scala\com\gphive\connections\DBManager.scala:14:166: not found: value uname
[error]   val jdbcDevUrl      = s"jdbc:postgresql://${jdbcDevHostname}:${jdbcDevPort}/${jdbcDevDatabase}?ssl=true&sslfactory=org.postgresql.ssl.NonValidatingFactory" + s",${uname},${pwd}"
[error]                                                                                                                                                                      ^
[error] C:\YearPartition\src\main\scala\com\gphive\connections\DBManager.scala:14:175: not found: value pwd
[error]   val jdbcDevUrl      = s"jdbc:postgresql://${jdbcDevHostname}:${jdbcDevPort}/${jdbcDevDatabase}?ssl=true&sslfactory=org.postgresql.ssl.NonValidatingFactory" + s",${uname},${pwd}"
[error]                                                                                                                                                                               ^
[error] C:\YearPartition\src\main\scala\com\gphive\connections\DBManager.scala:9:36: not found: value connections
[error]   dbProps.load(new FileInputStream(connections.properties))
[error]                                    ^
[error] C:\YearPartition\src\main\scala\com\yearpartition\obj\PartitionRetrieval.scala:10:68: not found: value connectionProperties
[error]     val tabData = spark.read.jdbc(dBManager.jdbcDevUrl,"tableName",connectionProperties)
[error]                                                                    ^
[error] four errors found
[error] (Compile / compileIncremental) Compilation failed
[error] Total time: 45 s, completed Jul 19, 2018 6:06:25 PM

Process finished with exit code 1



This is my build.sbt file:

name := "YearPartition"

version := "0.1"

scalaVersion := "2.11.8"

// https://mvnrepository.com/artifact/org.apache.spark/spark-core
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0"

// https://mvnrepository.com/artifact/org.apache.spark/spark-sql
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.0"
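
For what it is worth, the [warn] lines above are sbt eviction warnings (Spark and Hadoop pulling different versions of the same libraries), not what fails the build; the four compile errors are. Also, build.sbt never declares the PostgreSQL JDBC driver, so the read would still fail at runtime once the code compiles. A sketch of the missing dependency, assuming driver version 42.2.5 (any version compatible with the server should work):

// https://mvnrepository.com/artifact/org.postgresql/postgresql
libraryDependencies += "org.postgresql" % "postgresql" % "42.2.5"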

Could anyone let me know how I can fix the version-conflict warnings and load the properties file correctly?
