
HDP 2.6: Spark on Zeppelin doesn't work (NoSuchMethodError)

Hi,

I finished a fresh HDP 2.6 installation last week, and since then I have not been able to use either of the Spark interpreters (%spark and %spark2).

I ran the Spark samples directly on the server and they worked well (https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.1/bk_spark-component-guide/content/run-spark2...)

But when I try to use Spark from Zeppelin, I get this error:

Verify Spark Version (should be 2.x)

%spark2.spark
spark.version

org.apache.zeppelin.interpreter.InterpreterException: Exception in thread "main" java.lang.NoSuchMethodError: scala.Predef$.$conforms()Lscala/Predef$$less$colon$less;
	at org.apache.spark.util.Utils$.getDefaultPropertiesFile(Utils.scala:2086)
	at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:118)
	at org.apache.spark.deploy.SparkSubmitArguments$$anonfun$mergeDefaultSparkProperties$1.apply(SparkSubmitArguments.scala:118)
	at scala.Option.getOrElse(Option.scala:120)
	at org.apache.spark.deploy.SparkSubmitArguments.mergeDefaultSparkProperties(SparkSubmitArguments.scala:118)
	at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:104)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterManagedProcess.start(RemoteInterpreterManagedProcess.java:149)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreterProcess.reference(RemoteInterpreterProcess.java:73)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.open(RemoteInterpreter.java:260)
	at org.apache.zeppelin.interpreter.remote.RemoteInterpreter.getFormType(RemoteInterpreter.java:425)
	at org.apache.zeppelin.interpreter.LazyOpenInterpreter.getFormType(LazyOpenInterpreter.java:111)
	at org.apache.zeppelin.notebook.Paragraph.jobRun(Paragraph.java:387)
	at org.apache.zeppelin.scheduler.Job.run(Job.java:175)
	at org.apache.zeppelin.scheduler.RemoteScheduler$JobRunner.run(RemoteScheduler.java:329)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)

I get exactly the same error with both %spark and %spark2.

I did some research on this error, and it seems to be a compatibility problem between Spark versions, but I found nothing that helped for Zeppelin on HDP.
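
One way to test the version-mismatch idea: `$conforms` was added to `scala.Predef` in Scala 2.11 and does not exist in 2.10, so this `NoSuchMethodError` typically appears when classes compiled for Scala 2.11 are loaded next to a Scala 2.10 `scala-library` jar. The sketch below only lists the `scala-library` jars found under a set of directories so that mixed versions stand out. The helper name `list_scala_libs` is made up for illustration, and the directories in the usage note are assumptions about a default HDP layout, not something confirmed in this thread.

```shell
#!/usr/bin/env bash
# list_scala_libs is a hypothetical helper, not an HDP or Zeppelin tool.
# It prints every scala-library jar found under the given directories,
# sorted, so a mix of 2.10.x and 2.11.x jars is easy to spot.
list_scala_libs() {
  local dir
  for dir in "$@"; do
    # Skip directories that do not exist on this host.
    [ -d "$dir" ] || continue
    # -L follows symlinks, since install paths are often symlinked.
    find -L "$dir" -type f -name 'scala-library*.jar' 2>/dev/null
  done | sort
}
```

A possible call, with assumed paths, would be `list_scala_libs /usr/hdp/current/spark-client /usr/hdp/current/spark2-client /usr/hdp/current/zeppelin-server`. If both 2.10 and 2.11 jars show up, the next question is which of them the interpreter process actually puts on its classpath.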

I followed this tutorial and everything worked except the Zeppelin part:
https://fr.hortonworks.com/blog/try-apache-spark-2-1-zeppelin-hortonworks-data-cloud/

More information about my cluster:

3 nodes on CentOS 7

Zeppelin, Spark, Spark2 and the Livy server are on the same host

My zeppelin-env.sh file:

# export JAVA_HOME=
export JAVA_HOME={{java64_home}}
# export MASTER=                              # Spark master url. eg. spark://master_addr:7077. Leave empty if you want to use local mode.
export MASTER=yarn-client
export SPARK_YARN_JAR={{spark_jar}}
# export ZEPPELIN_JAVA_OPTS                   # Additional jvm options. for example, export ZEPPELIN_JAVA_OPTS="-Dspark.executor.memory=8g -Dspark.cores.max=16"
export ZEPPELIN_JAVA_OPTS="-Dhdp.version={{full_stack_version}} -Dspark.executor.memory={{executor_mem}} -Dspark.executor.instances={{executor_instances}} -Dspark.yarn.queue={{spark_queue}}"
# export ZEPPELIN_MEM                         # Zeppelin jvm mem options Default -Xms1024m -Xmx1024m -XX:MaxPermSize=512m
# export ZEPPELIN_INTP_MEM                    # zeppelin interpreter process jvm mem options. Default -Xms1024m -Xmx1024m -XX:MaxPermSize=512m
# export ZEPPELIN_INTP_JAVA_OPTS              # zeppelin interpreter process jvm options.
# export ZEPPELIN_SSL_PORT                    # ssl port (used when ssl environment variable is set to true)


# export ZEPPELIN_LOG_DIR                     # Where log files are stored.  PWD by default.
export ZEPPELIN_LOG_DIR={{zeppelin_log_dir}}
# export ZEPPELIN_PID_DIR                     # The pid files are stored. ${ZEPPELIN_HOME}/run by default.
export ZEPPELIN_PID_DIR={{zeppelin_pid_dir}}
# export ZEPPELIN_WAR_TEMPDIR                 # The location of jetty temporary directory.
# export ZEPPELIN_NOTEBOOK_DIR                # Where notebook saved
# export ZEPPELIN_NOTEBOOK_HOMESCREEN         # Id of notebook to be displayed in homescreen. ex) 2A94M5J1Z
# export ZEPPELIN_NOTEBOOK_HOMESCREEN_HIDE    # hide homescreen notebook from list when this value set to "true". default "false"
# export ZEPPELIN_NOTEBOOK_S3_BUCKET          # Bucket where notebook saved
# export ZEPPELIN_NOTEBOOK_S3_ENDPOINT        # Endpoint of the bucket
# export ZEPPELIN_NOTEBOOK_S3_USER            # User in bucket where notebook saved. For example bucket/user/notebook/2A94M5J1Z/note.json
# export ZEPPELIN_IDENT_STRING                # A string representing this instance of zeppelin. $USER by default.
# export ZEPPELIN_NICENESS                    # The scheduling priority for daemons. Defaults to 0.
# export ZEPPELIN_INTERPRETER_LOCALREPO       # Local repository for interpreter's additional dependency loading
# export ZEPPELIN_NOTEBOOK_STORAGE            # Refers to pluggable notebook storage class, can have two classes simultaneously with a sync between them (e.g. local and remote).
# export ZEPPELIN_NOTEBOOK_ONE_WAY_SYNC       # If there are multiple notebook storages, should we treat the first one as the only source of truth?
# export ZEPPELIN_NOTEBOOK_PUBLIC             # Make notebook public by default when created, private otherwise
export ZEPPELIN_INTP_CLASSPATH_OVERRIDES="{{external_dependency_conf}}"


#### Spark interpreter configuration ####


## Use provided spark installation ##
## defining SPARK_HOME makes Zeppelin run spark interpreter process using spark-submit
##
#export SPARK_HOME=                          # (required) When it is defined, load it instead of Zeppelin embedded Spark libraries
#export SPARK_HOME={{spark_home}}
# export SPARK_SUBMIT_OPTIONS                 # (optional) extra options to pass to spark submit. eg) "--driver-memory 512M --executor-memory 1G".
export SPARK_APP_NAME=Zeppelin-Spark                     # (optional) The name of spark application.


## Use embedded spark binaries ##
## without SPARK_HOME defined, Zeppelin still able to run spark interpreter process using embedded spark binaries.
## however, it is not encouraged when you can define SPARK_HOME
##
# Options read in YARN client mode
# export HADOOP_CONF_DIR                      # yarn-site.xml is located in configuration directory in HADOOP_CONF_DIR.
export HADOOP_CONF_DIR=/etc/hadoop/conf
# Pyspark (supported with Spark 1.2.1 and above)
# To configure pyspark, you need to set spark distribution's path to 'spark.home' property in Interpreter setting screen in Zeppelin GUI
# export PYSPARK_PYTHON                       # path to the python command. must be the same path on the driver(Zeppelin) and all workers.
# export PYTHONPATH


export PYTHONPATH="${SPARK_HOME}/python:${SPARK_HOME}/python/lib/py4j-0.8.2.1-src.zip"
export SPARK_YARN_USER_ENV="PYTHONPATH=${PYTHONPATH}"


## Spark interpreter options ##
##
# export ZEPPELIN_SPARK_USEHIVECONTEXT        # Use HiveContext instead of SQLContext if set true. true by default.
# export ZEPPELIN_SPARK_CONCURRENTSQL         # Execute multiple SQL concurrently if set true. false by default.
# export ZEPPELIN_SPARK_IMPORTIMPLICIT        # Import implicits, UDF collection, and sql if set true. true by default.
# export ZEPPELIN_SPARK_MAXRESULT             # Max number of Spark SQL result to display. 1000 by default.
# export ZEPPELIN_WEBSOCKET_MAX_TEXT_MESSAGE_SIZE       # Size in characters of the maximum text message to be received by websocket. Defaults to 1024000




#### HBase interpreter configuration ####


## To connect to HBase running on a cluster, either HBASE_HOME or HBASE_CONF_DIR must be set


# export HBASE_HOME=                          # (require) Under which HBase scripts and configuration should be
# export HBASE_CONF_DIR=                      # (optional) Alternatively, configuration directory can be set to point to the directory that has hbase-site.xml


# export ZEPPELIN_IMPERSONATE_CMD             # Optional, when user want to run interpreter as end web user. eg) 'sudo -H -u ${ZEPPELIN_IMPERSONATE_USER} bash -c '
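
Most of the file above is commented-out template text, which makes it hard to compare with another cluster. A minimal sketch for printing only the active lines (neither blank nor comments) is shown here. The helper name `active_env_lines` is hypothetical, and the file path in the usage note is an assumption about where the rendered config lives on an HDP node.

```shell
#!/usr/bin/env bash
# active_env_lines is a hypothetical helper for illustration.
# It prints the lines of an env file that are neither blank nor comments,
# each prefixed with its line number in the original file.
active_env_lines() {
  grep -nEv '^[[:space:]]*(#|$)' "$1"
}
```

For example, `active_env_lines /etc/zeppelin/conf/zeppelin-env.sh` (assumed path) could be run on two hosts and the outputs compared with `diff`.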

If anyone has already had this problem or has an idea how to fix it, I would be really thankful!

Thanks in advance,
Félicien

4 REPLIES

@Félicien Catherin

Did you modify the zeppelin-env.sh at all? That command you are running should work fine, so I'm wondering if something was modified within your configs as part of the install.

I am running HDP 2.6 and was able to run a simple test (as you did above). It worked as expected. I've included the commands that I ran as well as the parameter settings within my Zeppelin interpreter. Hopefully this is helpful as a comparison for you.

[Screenshots: 28403-screen-shot-2017-08-16-at-122746-pm.png, 28404-screen-shot-2017-08-16-at-122730-pm.png]

Thanks again @Dan Zaratsian,

Could you also show me your Spark2 environment (visible from the Spark2 history server)? Maybe it can help me find a problem in other configurations or in the jar versions.

Hi @Dan Zaratsian,
Thanks for your answer. I did not change anything in the file, and my Spark2 properties are exactly the same as yours.
I also tried uninstalling the services and reinstalling them, and it didn't change anything.


@Félicien Catherin Have you set the CLASSPATH variable in zeppelin-env? If so, please comment it out and try again.
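
A quick way to check for this is sketched below: it prints any uncommented line that sets a variable named exactly `CLASSPATH`. Names such as `ZEPPELIN_INTP_CLASSPATH_OVERRIDES` and commented-out lines are deliberately not matched. The helper name `find_classpath_exports` is hypothetical, and the path in the usage note is an assumption about the HDP layout.

```shell
#!/usr/bin/env bash
# find_classpath_exports is a hypothetical helper for illustration.
# It prints "line-number:line" for each active CLASSPATH assignment and
# returns a non-zero status when none is found (grep's own exit status).
find_classpath_exports() {
  local file="$1"
  # Match "CLASSPATH=..." or "export CLASSPATH=..." at the start of a line.
  grep -nE '^[[:space:]]*(export[[:space:]]+)?CLASSPATH=' "$file"
}
```

For example: `find_classpath_exports /etc/zeppelin/conf/zeppelin-env.sh` (assumed path). No output means no active `CLASSPATH` assignment in that file, although the variable could still be set elsewhere in the environment of the Zeppelin process.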
