Support Questions
Find answers, ask questions, and share your expertise

How to connect to MySQL, Hive and HBase in HDP2.5 sandbox from Eclipse IDE using SparkSQL code

New Contributor

Hello,

I am new to Spark and HDP. As part of learning, I have written SparkSQL code in the Eclipse IDE on my local machine and am trying to connect to the MySQL instance running inside the HDP 2.5 sandbox (VMware) to perform a few operations.

However, it is failing with the below error message:

com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure Last packet sent to the server was 0 ms ago. at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

I am attaching the code below. Please let me know where I am going wrong, and please share any document or link describing the procedure/steps to follow to achieve this.

I tried replacing the URL with the IP address and different port numbers, but no luck. Your quick help would be highly appreciated.

Code:

import org.apache.spark.SparkConf
import org.apache.spark.SparkContext
import org.apache.spark.rdd.JdbcRDD
import java.sql.{Connection, DriverManager, ResultSet}

object SparkSqlJDBCExample {
  def main(args: Array[String]) {
    val url = "jdbc:mysql://sandbox.hortonworks.com:2222/muradb"
    val userName = "root"
    val password = "hadoop"
    Class.forName("com.mysql.jdbc.Driver").newInstance()

    val sparkConf = new SparkConf().setAppName("Spark Sql JDBC").setMaster("local")
    val sc = new SparkContext(sparkConf)

    // JdbcRDD requires two '?' placeholders in the query, which it fills with
    // the partition bounds (here 3 and 5, over 1 partition). 'id' stands in
    // for whatever numeric key column the hospital table actually has.
    val myRDD = new JdbcRDD(
      sc,
      () => DriverManager.getConnection(url, userName, password),
      "select * from hospital where id >= ? and id <= ?",
      3, 5, 1,
      r => r.getString("ProviderCity"))
    myRDD.foreach(println)
  }
}
Error:

17/06/23 11:12:27 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/06/23 11:12:27 INFO SecurityManager: Changing view acls to: pc
17/06/23 11:12:27 INFO SecurityManager: Changing modify acls to: pc
17/06/23 11:12:27 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(pc); users with modify permissions: Set(pc)
17/06/23 11:12:28 INFO Utils: Successfully started service 'sparkDriver' on port 63938.
17/06/23 11:12:28 INFO Slf4jLogger: Slf4jLogger started
17/06/23 11:12:28 INFO Remoting: Starting remoting
17/06/23 11:12:28 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriverActorSystem@192.168.61.1:63951]
17/06/23 11:12:28 INFO Utils: Successfully started service 'sparkDriverActorSystem' on port 63951.
17/06/23 11:12:28 INFO SparkEnv: Registering MapOutputTracker
17/06/23 11:12:28 INFO SparkEnv: Registering BlockManagerMaster
17/06/23 11:12:29 INFO DiskBlockManager: Created local directory at C:\Users\pc\AppData\Local\Temp\blockmgr-c0c5f8b2-7869-438e-b728-a6ffa5e29129
17/06/23 11:12:29 INFO MemoryStore: MemoryStore started with capacity 2.4 GB
17/06/23 11:12:29 INFO SparkEnv: Registering OutputCommitCoordinator
17/06/23 11:12:29 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/06/23 11:12:29 INFO SparkUI: Started SparkUI at http://192.168.61.1:4040
17/06/23 11:12:29 INFO Executor: Starting executor ID driver on host localhost
17/06/23 11:12:29 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 63958.
17/06/23 11:12:29 INFO NettyBlockTransferService: Server created on 63958
17/06/23 11:12:29 INFO BlockManagerMaster: Trying to register BlockManager
17/06/23 11:12:29 INFO BlockManagerMasterEndpoint: Registering block manager localhost:63958 with 2.4 GB RAM, BlockManagerId(driver, localhost, 63958)
17/06/23 11:12:29 INFO BlockManagerMaster: Registered BlockManager
17/06/23 11:12:29 INFO SparkContext: Starting job: foreach at SparkSqlJDBCExample.scala:25
17/06/23 11:12:29 INFO DAGScheduler: Got job 0 (foreach at SparkSqlJDBCExample.scala:25) with 1 output partitions
17/06/23 11:12:29 INFO DAGScheduler: Final stage: ResultStage 0 (foreach at SparkSqlJDBCExample.scala:25)
17/06/23 11:12:29 INFO DAGScheduler: Parents of final stage: List()
17/06/23 11:12:29 INFO DAGScheduler: Missing parents: List()
17/06/23 11:12:29 INFO DAGScheduler: Submitting ResultStage 0 (JdbcRDD[0] at JdbcRDD at SparkSqlJDBCExample.scala:23), which has no missing parents
17/06/23 11:12:31 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 1536.0 B, free 1536.0 B)
17/06/23 11:12:32 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 995.0 B, free 2.5 KB)
17/06/23 11:12:32 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on localhost:63958 (size: 995.0 B, free: 2.4 GB)
17/06/23 11:12:32 INFO SparkContext: Created broadcast 0 from broadcast at DAGScheduler.scala:1006
17/06/23 11:12:32 INFO DAGScheduler: Submitting 1 missing tasks from ResultStage 0 (JdbcRDD[0] at JdbcRDD at SparkSqlJDBCExample.scala:23)
17/06/23 11:12:32 INFO TaskSchedulerImpl: Adding task set 0.0 with 1 tasks
17/06/23 11:12:32 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, partition 0,PROCESS_LOCAL, 1907 bytes)
17/06/23 11:12:32 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
17/06/23 11:14:32 INFO JdbcRDD: closed connection
17/06/23 11:14:32 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure Last packet sent to the server was 0 ms ago.
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:403)
    at com.mysql.jdbc.SQLError.createCommunicationsException(SQLError.java:1074)
    at com.mysql.jdbc.ConnectionImpl.createNewIO(ConnectionImpl.java:2037)
    at com.mysql.jdbc.ConnectionImpl.<init>(ConnectionImpl.java:718)
    at com.mysql.jdbc.JDBC4Connection.<init>(JDBC4Connection.java:46)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at com.mysql.jdbc.Util.handleNewInstance(Util.java:403)
    at com.mysql.jdbc.ConnectionImpl.getInstance(ConnectionImpl.java:291)
    at com.mysql.jdbc.NonRegisteringDriver.connect(NonRegisteringDriver.java:283)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:247)
    at com.demo.Processor.SparkSqlJDBCExample$anonfun$1.apply(SparkSqlJDBCExample.scala:23)
    at com.demo.Processor.SparkSqlJDBCExample$anonfun$1.apply(SparkSqlJDBCExample.scala:23)
    at org.apache.spark.rdd.JdbcRDD$anon$1.<init>(JdbcRDD.scala:78)
    at org.apache.spark.rdd.JdbcRDD.compute(JdbcRDD.scala:74)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:306)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:270)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
    at org.apache.spark.scheduler.Task.run(Task.scala:89)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:213)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Caused by: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure

2 REPLIES

Re: How to connect to MySQL, Hive and HBase in HDP2.5 sandbox from Eclipse IDE using SparkSQL code

Guru

Looks like MySQL is either not running at jdbc:mysql://sandbox.hortonworks.com:2222/muradb or not accepting remote connections. Take a look at your MySQL configuration and see if you can connect to it from any JDBC client.
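To rule Spark out entirely, you could first run a bare JDBC probe from the same machine. Here is a minimal sketch; it reuses the URL and credentials from the question and assumes the MySQL Connector/J jar is on the classpath:

```scala
import java.sql.DriverManager

// Minimal connectivity probe: succeeds only if MySQL is reachable at the URL
// and accepts a remote login for this user. If this fails with the same
// CommunicationsException, the problem is the network/MySQL setup, not Spark.
object JdbcProbe {
  def main(args: Array[String]): Unit = {
    Class.forName("com.mysql.jdbc.Driver")
    val conn = DriverManager.getConnection(
      "jdbc:mysql://sandbox.hortonworks.com:2222/muradb", "root", "hadoop")
    try {
      val rs = conn.createStatement().executeQuery("SELECT 1")
      rs.next()
      println("Connected OK, SELECT 1 returned " + rs.getInt(1))
    } finally {
      conn.close()
    }
  }
}
```

Note that on the sandbox, port 2222 is typically the SSH port-forward to the guest; if MySQL is exposed at all, it usually listens on its default port 3306, so that is worth trying here first.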

Re: How to connect to MySQL, Hive and HBase in HDP2.5 sandbox from Eclipse IDE using SparkSQL code

Contributor

Please try this:

[root@sandbox ~]# ls /usr/share/java/mysql-connector-java.jar
/usr/share/java/mysql-connector-java.jar

[root@sandbox ~]# spark-shell --jars /usr/share/java/mysql-connector-java.jar
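With the connector jar on the classpath, the table can also be read through the DataFrame JDBC source inside that spark-shell. A sketch against the Spark 1.6 API shipped with HDP 2.5, reusing the database, table, and credentials from the question and assuming MySQL listens on its default port 3306:

```scala
// Inside spark-shell; sqlContext is provided by the shell.
val df = sqlContext.read
  .format("jdbc")
  .option("url", "jdbc:mysql://sandbox.hortonworks.com:3306/muradb") // 3306 = MySQL default port (assumption)
  .option("driver", "com.mysql.jdbc.Driver")
  .option("dbtable", "hospital")
  .option("user", "root")
  .option("password", "hadoop")
  .load()

// Same projection the JdbcRDD code was after, without hand-written partitioning.
df.select("ProviderCity").show(5)
```

Unlike JdbcRDD, this reader needs no `?` placeholders in the query; partitioning, if wanted, is configured separately via the partitionColumn/lowerBound/upperBound options.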