
Exception in passing JDBC connection properties to Spark object


I have recently completed studying Scala and Spark. As an exercise, I am trying to read data from a table in a Postgres database over a JDBC connection. I created a Scala SBT project and a properties file to store all the connection properties.

I have the following properties in connections.properties file:

devHost=xx.xxx.xxx.xxx
devPort=xxxx
devDbName=base
devUserName=username
devPassword=password
gpDriverClass=org.postgresql.Driver

I created a DBManager class where I initialize the connection properties:

import java.io.FileInputStream
import java.util.Properties

class DBManager {
  val dbProps = new Properties()
  val connectionProperties = new Properties()
  dbProps.load(new FileInputStream(connections.properties))

  val jdbcDevHostname = dbProps.getProperty("devHost")
  val jdbcDevPort     = dbProps.getProperty("devPort")
  val jdbcDevDatabase = dbProps.getProperty("devDbName")
  val jdbcDevUrl      = s"jdbc:postgresql://${jdbcDevHostname}:${jdbcDevPort}/${jdbcDevDatabase}?ssl=true&sslfactory=org.postgresql.ssl.NonValidatingFactory" + s",${uname},${pwd}"

  connectionProperties.setProperty("Driver",dbProps.getProperty("gpDriverClass"))
  connectionProperties.put("user", dbProps.getProperty("devUserName"))
  connectionProperties.put("password", dbProps.getProperty("devPassword"))
}

In a Scala object, I am trying to use all these details as below:

import org.apache.spark.sql.SparkSession
import com.gphive.connections.DBManager

object PartitionRetrieval {
  def main(args: Array[String]): Unit = {
    val dBManager = new DBManager();
    val spark = SparkSession.builder().enableHiveSupport().appName("GP_YEARLY_DATA").getOrCreate()
    val tabData = spark.read.jdbc(dBManager.jdbcDevUrl,"tableName",connectionProperties)

  }
}

I referred to this link when creating the above code. When I execute it, I get version-conflict warnings and then compilation errors around loading the properties file:

[warn] Found version conflict(s) in library dependencies; some are suspected to be binary incompatible:
[warn]     * io.netty:netty:3.9.9.Final is selected over {3.6.2.Final, 3.7.0.Final}
[warn]         +- org.apache.spark:spark-core_2.11:2.2.0             (depends on 3.9.9.Final)
[warn]         +- org.apache.zookeeper:zookeeper:3.4.6               (depends on 3.6.2.Final)
[warn]         +- org.apache.hadoop:hadoop-hdfs:2.6.5                (depends on 3.6.2.Final)
[warn]     * commons-net:commons-net:2.2 is selected over 3.1
[warn]         +- org.apache.spark:spark-core_2.11:2.2.0             (depends on 2.2)
[warn]         +- org.apache.hadoop:hadoop-common:2.6.5              (depends on 3.1)
[warn]     * com.google.guava:guava:11.0.2 is selected over {12.0.1, 16.0.1}
[warn]         +- org.apache.hadoop:hadoop-yarn-client:2.6.5         (depends on 11.0.2)
[warn]         +- org.apache.hadoop:hadoop-yarn-api:2.6.5            (depends on 11.0.2)
[warn]         +- org.apache.hadoop:hadoop-yarn-common:2.6.5         (depends on 11.0.2)
[warn]         +- org.apache.hadoop:hadoop-yarn-server-nodemanager:2.6.5 (depends on 11.0.2)
[warn]         +- org.apache.hadoop:hadoop-yarn-server-common:2.6.5  (depends on 11.0.2)
[warn]         +- org.apache.hadoop:hadoop-hdfs:2.6.5                (depends on 11.0.2)
[warn]         +- org.apache.curator:curator-framework:2.6.0         (depends on 16.0.1)
[warn]         +- org.apache.curator:curator-client:2.6.0            (depends on 16.0.1)
[warn]         +- org.apache.curator:curator-recipes:2.6.0           (depends on 16.0.1)
[warn]         +- org.apache.hadoop:hadoop-common:2.6.5              (depends on 16.0.1)
[warn]         +- org.htrace:htrace-core:3.0.4                       (depends on 12.0.1)
[warn] Run 'evicted' to see detailed eviction warnings
[info] Compiling 2 Scala sources to C:\YearPartition\target\scala-2.11\classes ...
[error] C:\YearPartition\src\main\scala\com\gphive\connections\DBManager.scala:14:166: not found: value uname
[error]   val jdbcDevUrl      = s"jdbc:postgresql://${jdbcDevHostname}:${jdbcDevPort}/${jdbcDevDatabase}?ssl=true&sslfactory=org.postgresql.ssl.NonValidatingFactory" + s",${uname},${pwd}"
[error]                                                                                                                                                                      ^
[error] C:\YearPartition\src\main\scala\com\gphive\connections\DBManager.scala:14:175: not found: value pwd
[error]   val jdbcDevUrl      = s"jdbc:postgresql://${jdbcDevHostname}:${jdbcDevPort}/${jdbcDevDatabase}?ssl=true&sslfactory=org.postgresql.ssl.NonValidatingFactory" + s",${uname},${pwd}"
[error]                                                                                                                                                                               ^
[error] C:\YearPartition\src\main\scala\com\gphive\connections\DBManager.scala:9:36: not found: value connections
[error]   dbProps.load(new FileInputStream(connections.properties))
[error]                                    ^
[error] C:\YearPartition\src\main\scala\com\yearpartition\obj\PartitionRetrieval.scala:10:68: not found: value connectionProperties
[error]     val tabData = spark.read.jdbc(dBManager.jdbcDevUrl,"tableName",connectionProperties)
[error]                                                                    ^
[error] four errors found
[error] (Compile / compileIncremental) Compilation failed
[error] Total time: 45 s, completed Jul 19, 2018 6:06:25 PM

Process finished with exit code 1
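
As an aside, the [warn] lines above are dependency-eviction warnings, not errors; the build actually fails on the four "not found" compile errors. If the warnings should also go away, sbt allows the version choices to be made explicit via dependencyOverrides. A minimal sketch, assuming sbt 1.x (sbt 0.13 takes a Set instead of a Seq), with the versions copied from the eviction report above; it would go in the build.sbt shown below:

// Pin the winning versions so sbt treats the evictions as deliberate.
dependencyOverrides ++= Seq(
  "io.netty"         % "netty"       % "3.9.9.Final",
  "commons-net"      % "commons-net" % "2.2",
  "com.google.guava" % "guava"       % "11.0.2"
)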



This is my build.sbt file:

name := "YearPartition"

version := "0.1"

scalaVersion := "2.11.8"

// https://mvnrepository.com/artifact/org.apache.spark/spark-core
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.2.0"

// https://mvnrepository.com/artifact/org.apache.spark/spark-sql
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.2.0"

Could anyone let me know how I can fix the version mismatch warnings and load the properties file correctly?
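
For reference, the four compile errors all point at the same snippet: the properties file name must be a quoted String (not the bare identifier connections.properties), uname and pwd are never defined (the credentials belong in connectionProperties rather than in the URL), and connectionProperties has to be referenced through the dBManager instance. A minimal corrected sketch of DBManager, assuming connections.properties sits in the working directory:

import java.io.FileInputStream
import java.util.Properties

class DBManager {
  val dbProps = new Properties()
  // File name as a String literal, not a bare identifier.
  dbProps.load(new FileInputStream("connections.properties"))

  val jdbcDevHostname = dbProps.getProperty("devHost")
  val jdbcDevPort     = dbProps.getProperty("devPort")
  val jdbcDevDatabase = dbProps.getProperty("devDbName")

  // Keep the credentials out of the URL; Spark reads them from the
  // Properties object passed to spark.read.jdbc.
  val jdbcDevUrl: String =
    s"jdbc:postgresql://$jdbcDevHostname:$jdbcDevPort/$jdbcDevDatabase" +
      "?ssl=true&sslfactory=org.postgresql.ssl.NonValidatingFactory"

  val connectionProperties = new Properties()
  connectionProperties.setProperty("driver", dbProps.getProperty("gpDriverClass")) // Spark's JDBC option key is lowercase "driver"
  connectionProperties.setProperty("user", dbProps.getProperty("devUserName"))
  connectionProperties.setProperty("password", dbProps.getProperty("devPassword"))
}

The object would then reference the properties through the instance:

val tabData = spark.read.jdbc(dBManager.jdbcDevUrl, "tableName", dBManager.connectionProperties)

Note that build.sbt also lists no PostgreSQL JDBC driver; a dependency such as "org.postgresql" % "postgresql" % "42.2.5" (the version here is an assumption) would be needed at runtime.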