
Can't run Zeppelin with Spark and external jars

New Contributor

Hi all,

I'm trying to get Zeppelin working with Spark, using the Oracle JDBC driver (ojdbc7.jar). I have already set the spark.files and spark.driver.extraClassPath attributes in the Custom spark-defaults configuration, and SPARK_SUBMIT_OPTIONS in the zeppelin-env template in the Ambari interface, but I still get the error below:
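For reference, a minimal sketch of the settings described above, assuming the jar lives at /usr/share/java/ojdbc7.jar (the path is an assumption; adjust it to your install). Note that spark.executor.extraClassPath is also commonly needed so the executors can see the driver, not just the Spark driver process:

```
# Custom spark-defaults (Ambari):
spark.files                    /usr/share/java/ojdbc7.jar
spark.driver.extraClassPath    /usr/share/java/ojdbc7.jar
spark.executor.extraClassPath  /usr/share/java/ojdbc7.jar

# zeppelin-env template (Ambari):
export SPARK_SUBMIT_OPTIONS="--jars /usr/share/java/ojdbc7.jar"
```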

java.lang.ClassNotFoundException: oracle.jdbc.OracleDriver
	at scala.tools.nsc.interpreter.AbstractFileClassLoader.findClass(AbstractFileClassLoader.scala:83)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
	at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
	at org.apache.spark.sql.execution.datasources.jdbc.DriverRegistry$.register(DriverRegistry.scala:38)
	at org.apache.spark.sql.execution.datasources.jdbc.DefaultSource.createRelation(DefaultSource.scala:41)
	at org.apache.spark.sql.execution.datasources.ResolvedDataSource$.apply(ResolvedDataSource.scala:158)
	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:119)
	at $iwC$iwC$iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:33)
	at $iwC$iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:38)
	at $iwC$iwC$iwC$iwC$iwC$iwC.<init>(<console>:40)
	at $iwC$iwC$iwC$iwC$iwC.<init>(<console>:42)
	at $iwC$iwC$iwC$iwC.<init>(<console>:44)
	at $iwC$iwC$iwC.<init>(<console>:46)
	at $iwC$iwC.<init>(<console>:48)
	at $iwC.<init>(<console>:50)
	at <init>(<console>:52)
	at .<init>(<console>:56)
	at .<clinit>(<console>)
	at .<init>(<console>:7)
	at .<clinit>(<console>)
	at $print(<console>)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:1065)
	at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1346)
	at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:840)
	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:871)
	at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:819)
	at org.apache.zeppelin.spark.SparkInterpreter.interpretInput(SparkInterpreter.java:709)
	at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:673)
	at org.apache.zeppelin.spark.SparkInterpreter.interpret(SparkInterpreter.java:666)
	at org.apache.zeppelin.interpreter.ClassloaderInterpreter.interpret(ClassloaderInterpreter.java:57)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.interpret(LazyOpenInterpreter.java:93)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterServer$InterpretJob.jobRun(RemoteInterpreterServer.java:295)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:171)
	at org.apache.zeppelin.scheduler.FIFOScheduler$1.run(FIFOScheduler.java:139)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
	at java.util.concurrent.FutureTask.run(FutureTask.java:262)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)

The code used:

(Scala)

val url = "jdbc:oracle:thin:user/pass@//127.0.0.1:1521/oracl"
val dbtable = "(select * from dwoad.unidade) unidade"
val driver = "oracle.jdbc.OracleDriver"
val table = sqlContext.read.format("jdbc").options(Map("url" -> url, "dbtable" -> dbtable, "driver" -> driver)).load()

(PySpark interpreter)

%pyspark
url = "jdbc:oracle:thin:user/pass@//127.0.0.1:1521/oracl"
dbtable = "(select * from dwoad.unidade) unidade"
driver = "oracle.jdbc.OracleDriver"
table = (sqlContext.read.format("jdbc")
    .options(url=url, dbtable=dbtable, driver=driver)
    .load())

Could anybody help me? Am I missing something? I'm using Zeppelin Notebook 0.0.5 (zeppelin-web-0.6.0.2.4.0.0-169.war), Spark 1.6.0.2.4, and Sandbox 2.4.

Thanks in advance.

3 REPLIES

New Contributor

Hi @cduby, thank you for replying.

I read the entry and applied some of the configurations, but the result is still the same. It seems that ojdbc7.jar is not loaded by Zeppelin, even when I load it with the %dep interpreter first:

%dep
z.reset() // z.load("joda-time:joda-time:2.9.1")
z.addRepo("maven oracle").url("http://maven.oracle.com").username("me@prov.com").password("1234")
z.load("com.oracle.jdbc:ojdbc7:12.1.0.2")
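If resolving the artifact from the authenticated Oracle Maven repository is the problem, one possible variation is to load the jar directly from a local path instead; Zeppelin's dep interpreter also accepts a filesystem path in z.load. A sketch, with a hypothetical jar location:

```scala
%dep
z.reset()
// Load the driver jar straight from the local filesystem instead of a remote repo
// (hypothetical path; point it at wherever ojdbc7.jar actually lives on the Zeppelin host)
z.load("/usr/share/java/ojdbc7.jar")
```

Note that in Zeppelin the %dep paragraph must run before the Spark interpreter starts, otherwise the interpreter has to be restarted for the dependency to take effect.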

New Contributor

Actually, I think it is loaded now, but when I run the %pyspark or Scala code the cell ends up in ERROR status with no stack trace.

Do you know which logs I should analyze?
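For the record, I will start by looking at what I assume are the default log locations on an Ambari-managed sandbox (paths and file names below are assumptions; they may differ on other installs):

```
/var/log/zeppelin/zeppelin-<user>-<host>.log                    # Zeppelin server log
/var/log/zeppelin/zeppelin-interpreter-spark-<user>-<host>.log  # Spark interpreter log
```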

Thanks again.