
How do I import external libraries for the Livy interpreter in Zeppelin (using YARN cluster mode)?

Explorer

I have no problem importing external libraries for the Spark interpreter using SPARK_SUBMIT_OPTIONS.

This method doesn't work with the Livy interpreter.

What is the best way to import an external library for the Livy interpreter in Zeppelin?

I prefer to import from local JARs without having to use remote repositories.

Thank you in advance.

1 ACCEPTED SOLUTION

You can load dynamic libraries into the Livy interpreter by setting the livy.spark.jars.packages property to a comma-separated list of Maven coordinates of JARs to include on the driver and executor classpaths. The coordinate format is groupId:artifactId:version.

Example

Property: livy.spark.jars.packages
Example: io.spray:spray-json_2.10:1.3.1
Description: Adds extra libraries to the Livy interpreter

https://zeppelin.apache.org/docs/0.7.0-SNAPSHOT/interpreter/livy.html#adding-external-libraries
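As a concrete sketch of the accepted solution (the property name and the spray-json coordinate come from the linked Zeppelin docs; the notebook paragraph is illustrative):

```
# Zeppelin UI: Interpreter > livy > edit > add property
livy.spark.jars.packages = io.spray:spray-json_2.10:1.3.1

# After saving and restarting the interpreter, the library can be
# imported from a notebook paragraph, e.g.:
#   %livy.spark
#   import spray.json._
```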


17 Replies

Contributor

@A. Karray You can specify JARs to use with Livy jobs using livy.spark.jars in the Livy interpreter configuration. This should be a comma-separated list of JAR locations, which must be stored on HDFS. Currently local files cannot be used (i.e., they won't be localized on the cluster when the job runs). It is a global setting, so all JARs listed will be available for all Livy jobs run by all users.
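Under the assumptions in this reply (JARs already uploaded to HDFS), the interpreter setting would look something like this; the path is the one used later in this thread:

```
# Livy interpreter setting (global, applies to all users' Livy jobs).
# Comma-separated list of JAR locations that must already be on HDFS:
livy.spark.jars = hdfs:///user/zeppelin/lib/postgresql-9.4-1203-jdbc42.jar
```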

Explorer

This solution doesn't work for me with the YARN cluster mode configuration.

When I print sc.jars I can see that I have added the dependency: hdfs:///user/zeppelin/lib/postgresql-9.4-1203-jdbc42.jar

But it's not possible to import any class from the JAR:

<console>:30: error: object postgresql is not a member of package org
       import org.postgresql.Driver

Explorer

Hi, did you find a solution?

I have the same problem...

Thanks


New Contributor

Hi,

This works fine for artifacts in the Maven Central repository. Do you know if there is a way to define a custom remote Maven repository?

I have tried using livy.spark.jars.ivy according to the link below, but Livy still tries to retrieve the artifact from Maven Central.

http://spark.apache.org/docs/latest/configuration.html

Thanks!

Explorer

This solution doesn't work for me with the YARN cluster mode configuration.

Explorer

Hi,

Did you find a solution for including libraries from an internal Maven repository?

When I inspect the log files, I can see that Livy tries to resolve dependencies against

http://dl.bintray.com/spark-packages, https://repo1.maven.org/, and the local m2 cache.

Is there a way to add a custom Maven repository?

I'm using Ambari and Zeppelin.

Thanks
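One avenue to try, heavily hedged: spark.jars.repositories was only added in Spark 2.2, and passing it through Livy with a livy. prefix is an assumption based on how livy.spark.jars.packages behaves; the repository URL and coordinate below are hypothetical:

```
# Additional remote repositories to search for Maven coordinates
livy.spark.jars.repositories = https://nexus.example.com/repository/maven-public
livy.spark.jars.packages = com.example:my-lib:1.0.0
```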

Rising Star

The JARs should be addable by setting the parameter key livy.spark.jars to an HDFS location in the Livy interpreter settings, but this does not seem to work. I had to place the needed JAR in the following directory on the Livy server:

/usr/hdp/2.5.3.0-37/livy/repl-jars
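For example (the JAR name is taken from later in this thread; the HDP version in the path varies by cluster):

```
# Copy the JAR into Livy's REPL classpath directory on the Livy server
cp mongo-java-driver-2.14.3.jar /usr/hdp/2.5.3.0-37/livy/repl-jars/
# Restart the Livy server (e.g. via Ambari) so new REPL sessions see it
```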

Explorer

Hi Ian,

Thanks for your response; unfortunately it doesn't work.

I've added all the JARs to the /usr/hdp/current/livy-server/repl-jars folder.

In the Spark environment I can see them with these properties:

  • spark.yarn.secondary.jars
  • spark.jars

All JARs are present in the container folder:

hadoop/yarn/local/usercache/mgervais/appcache/application_1481623014483_0014/container_e24_1481623014483_0014_01_000001

I'm using Zeppelin, Livy & Spark. (Installed with Ambari.)

Any idea?

Thanks

Rising Star

@Mickaël GERVAIS Check that the Livy interpreter is listed in the interpreter bindings for the notebook. Also, enable DEBUG logging on the Livy server and check the Livy .out file produced on the server. Finally, make sure you have restarted Livy and Zeppelin to pick up the changes. I tested this and it worked for me.
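A quick way to confirm what Livy actually resolved, assuming a log location typical of an Ambari-managed install (the exact path and file name are assumptions; adjust to your cluster):

```
# Search the Livy server .out file for JAR resolution/localization messages
grep -i jar /var/log/livy/livy-livy-server.out
```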

Explorer

@Ian Roberts I'm sorry, but I cannot make it work.

Here is my error:

import org.apache.commons.codec.binary.Base64
import java.time.LocalDateTime.now
import com.mongodb.BasicDBObjectBuilder.start

import org.apache.commons.codec.binary.Base64
import java.time.LocalDateTime.now
<console>:35: error: object mongodb is not a member of package com
         import com.mongodb.BasicDBObjectBuilder.start

The jar mongo-java-driver-2.14.3.jar is present in:

  • livy-server/repl-jars
  • hdfs:///user/mgervais/.sparkStaging/application_1481647493263_0001

And the Spark UI shows these properties:

  • spark.yarn.secondary.jars : ...mongo-java-driver-2.14.3.jar...
  • spark.jars : ...file:/usr/hdp/current/livy-server/repl-jars/mongo-java-driver-2.14.3.jar...

I have the Livy interpreter enabled in the Zeppelin notebook (it works; I can see the Livy sessions...).

I've restarted the full cluster with Ambari...

Thanks...

Rising Star

This does not look like an issue with the JAR being included, but rather an issue with the import statement. I briefly looked on Google and saw similar descriptions suggesting org.mongodb. I would focus on the import statement more than the inclusion of the JAR for Livy.

Explorer

Okay, but I have the same issue with other imports that are not part of the native libraries.

My own JAR cannot be included either...

Is this a problem with Zeppelin notebook?

Explorer

Hi,

I'm using this version of HDP: /usr/hdp/2.5.0.0-1245/

Is this the reason? Should I upgrade my stack?

New Contributor

Hi all,

Has anyone succeeded in using Livy with a custom Maven repository?

New Contributor

hello @Laurence Da Luz

I am trying to run CRAN packages in a Zeppelin notebook. I downloaded the packages and installed them on the server, but somehow the notebook does not pick up the package. For instance:

%livy.sparkr

library(data.table)

returns an error in the notebook.

Hope you can assist in that.

Rachna

Super Collaborator

@Rachna Dhand I know you must be way past this issue, but you have to install the packages on all NodeManager nodes as root so they are available to all users. Maybe this will help someone else in the future.
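A sketch of that advice (run as root on every NodeManager node; data.table is the package from the question, and the CRAN mirror URL is an assumption):

```
# Install into the site library so the package is visible to all users
R -e 'install.packages("data.table", repos = "https://cloud.r-project.org")'
```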