
HDP 2.5 Zeppelin and Spark


Hi all,

I have a kerberized cluster with HDP 2.5. I would like to use Zeppelin 0.6 and Spark2, but I have seen that there are many restrictions and problems, so for now I would at least like to get Zeppelin 0.6 working with Spark 1.6.

I followed the instructions and configured Zeppelin with my AD. I would also like to use impersonation; I think it is mandatory to execute a job as the actual user and not as a common zeppelin user (especially in order to read and write to HDFS).
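For reference, these are the usual prerequisites for Livy impersonation on a kerberized cluster, as a sketch only (property names are the standard Hadoop proxyuser and Livy settings; the `livy` service user name is an assumption and depends on your install):

```xml
<!-- core-site.xml: allow the livy service user to impersonate end users.
     "*" is permissive; restrict groups/hosts in production. -->
<property>
  <name>hadoop.proxyuser.livy.groups</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.livy.hosts</name>
  <value>*</value>
</property>
```

In addition, `livy.impersonation.enabled = true` must be set in livy.conf, and the Zeppelin Livy interpreter must point at the Livy server URL. Restart HDFS/YARN and Livy after changing the proxyuser settings.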

I have followed many other threads, but it is not working and nothing is clear.

Is anyone here running HDP 2.5 with Zeppelin working with Spark and Livy (for impersonation)?

In my case, when I try the following in Zeppelin:


I obtain:

Interpreter died:
Traceback (most recent call last):
  File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/tmp/7818688309791970952", line 469, in <module>
  File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/tmp/7818688309791970952", line 394, in main
    exec 'from import sc' in global_dict
  File "<string>", line 1, in <module>
  File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/", line 43, in <module>
  File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/", line 115, in __init__
  File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/", line 172, in _do_init
  File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/", line 235, in _initialize_context
  File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/", line 1064, in __call__
  File "/grid/4/hadoop/yarn/local/usercache/jmolero/appcache/application_1486188076080_0234/container_e14_1486188076080_0234_01_000001/", line 308, in get_return_value
py4j.protocol.Py4JJavaError: An error occurred while calling
: Added file file:/usr/hdp/current/spark-client/conf/hive-site.xml does not exist.
	at org.apache.spark.SparkContext.addFile(SparkContext.scala:1388)
	at org.apache.spark.SparkContext.addFile(SparkContext.scala:1364)
	at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491)
	at org.apache.spark.SparkContext$$anonfun$15.apply(SparkContext.scala:491)
	at scala.collection.immutable.List.foreach(List.scala:318)
	at org.apache.spark.SparkContext.<init>(SparkContext.scala:491)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
	at sun.reflect.NativeConstructorAccessorImpl.newInstance(
	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(
	at java.lang.reflect.Constructor.newInstance(
	at py4j.reflection.MethodInvoker.invoke(
	at py4j.reflection.ReflectionEngine.invoke(
	at py4j.Gateway.invoke(
	at py4j.commands.ConstructorCommand.invokeConstructor(
	at py4j.commands.ConstructorCommand.execute(


I could upgrade from HDP 2.5 to HDP 2.6, but I suspect it is most probably not going to work and the problem will only get worse (and Zeppelin may still not work).

Thanks in advance


Re: HDP 2.5 Zeppelin and Spark

Super Collaborator

@Jose Molero

Please install the spark-client on all the NodeManager hosts; the error you see with livy.pyspark is due to the spark-client being missing on the NodeManagers. Make sure to refresh the client configurations after installation so that the configs are copied to the hosts.
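To confirm which NodeManagers are missing the client, you could check for the exact file the traceback complains about on each host. A minimal sketch (the path comes from the error message above; how you reach each host, e.g. via ssh, is up to you):

```shell
#!/bin/sh
# check_file: report whether a required client config exists at the given path.
check_file() {
  if [ -f "$1" ]; then
    echo "present: $1"
  else
    echo "missing: $1"
  fi
}

# Run this on each NodeManager host (e.g. wrapped in ssh) and look for "missing":
check_file /usr/hdp/current/spark-client/conf/hive-site.xml
```

Any host that reports "missing" needs the spark-client installed and client configs refreshed before YARN containers started there can find hive-site.xml.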