Spark SQL: NoClassDefFoundError: org/apache/hadoop/hive/ql/parse/VariableSubstitution

Hi! I'm trying to get Spark-MongoDB working. I followed the example in the First_Steps document, but I get numerous exceptions when I run it:

# pyspark --packages com.stratio.datasource:spark-mongodb_2.10:0.10.3

>>> from pyspark.sql import SQLContext

>>> sqlContext.sql("CREATE TEMPORARY TABLE col_table USING com.stratio.datasource.mongodb OPTIONS (host 'host:port', database 'db', collection 'col')")

 

[ ... ]

 

Exception in thread "Thread-1998" java.lang.NoClassDefFoundError: org/apache/hadoop/hive/ql/parse/VariableSubstitution
    at java.lang.Class.getDeclaredMethods0(Native Method)
    at java.lang.Class.privateGetDeclaredMethods(Class.java:2625)
    at java.lang.Class.privateGetPublicMethods(Class.java:2743)
    at java.lang.Class.getMethods(Class.java:1480)
    at py4j.reflection.ReflectionEngine.getMethodsByNameAndLength(ReflectionEngine.java:365)
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:317)
    at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:342)
    at py4j.Gateway.invoke(Gateway.java:252)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:207)
    at java.lang.Thread.run(Thread.java:745)

According to the log lines printed at pyspark startup, both spark.executor.extraClassPath and spark.driver.extraClassPath contain jars from /usr/lib/hive/lib/:

2016-05-10 13:59:10,404 WARN  [Thread-2] spark.SparkConf (Logging.scala:logWarning(71)) - Setting 'spark.executor.extraClassPath' to '/usr/lib/hive/lib/stax-api-1.0.1.jar:/usr/lib/hive/lib/gson-2.2.4.jar:/usr/lib/hive/lib/geronimo-jaspic_1.0_spec-1.0.jar:/usr/lib/hive/lib/hamcrest-core-1.1.jar:/usr/lib/hive/lib/commons-math-2.1.jar:/usr/lib/hive/lib/datanucleus-rdbms-3.2.9.jar:/usr/lib/hive/lib/high-scale-lib-1.1.1.jar:/usr/lib/hive/lib/asm-3.2.jar:/usr/lib/hive/lib/hive-jdbc-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/oro-2.0.8.jar:/usr/lib/hive/lib/commons-io-2.4.jar:/usr/lib/hive/lib/groovy-all-2.4.4.jar:/usr/lib/hive/lib/commons-compiler-2.7.6.jar:/usr/lib/hive/lib/maven-scm-provider-svnexe-1.4.jar:/usr/lib/hive/lib/hive-cli.jar:/usr/lib/hive/lib/avro.jar:/usr/lib/hive/lib/jetty-all-server-7.6.0.v20120127.jar:/usr/lib/hive/lib/accumulo-core-1.6.0.jar:/usr/lib/hive/lib/plexus-utils-1.5.6.jar:/usr/lib/hive/lib/jasper-runtime-5.5.23.jar:/usr/lib/hive/lib/hive-metastore.jar:/usr/lib/hive/lib/pentaho-aggdesigner-algorithm-5.1.5-jhyde.jar:/usr/lib/hive/lib/ant-1.9.1.jar:/usr/lib/hive/lib/paranamer-2.3.jar:/usr/lib/hive/lib/hive-testutils.jar:/usr/lib/hive/lib/commons-compress-1.4.1.jar:/usr/lib/hive/lib/maven-scm-provider-svn-commons-1.4.jar:/usr/lib/hive/lib/jackson-annotations-2.2.2.jar:/usr/lib/hive/lib/asm-tree-3.1.jar:/usr/lib/hive/lib/hive-hwi.jar:/usr/lib/hive/lib/jackson-xc-1.9.2.jar:/usr/lib/hive/lib/mail-1.4.1.jar:/usr/lib/hive/lib/jta-1.1.jar:/usr/lib/hive/lib/commons-digester-1.8.jar:/usr/lib/hive/lib/antlr-runtime-3.4.jar:/usr/lib/hive/lib/commons-configuration-1.6.jar:/usr/lib/hive/lib/snappy-java-1.0.4.1.jar:/usr/lib/hive/lib/velocity-1.5.jar:/usr/lib/hive/lib/tempus-fugit-1.1.jar:/usr/lib/hive/lib/hive-shims.jar:/usr/lib/hive/lib/jdo-api-3.0.1.jar:/usr/lib/hive/lib/hive-hbase-handler.jar:/usr/lib/hive/lib/guava-14.0.1.jar:/usr/lib/hive/lib/ST4-4.0.4.jar:/usr/lib/hive/lib/libthrift-0.9.2.jar:/usr/lib/hive/lib/metrics-core-3.0.2.jar:/usr/lib/hive/lib/hive-cli-1
.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/hive-contrib-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/commons-beanutils-core-1.8.0.jar:/usr/lib/hive/lib/jcommander-1.32.jar:/usr/lib/hive/lib/hive-accumulo-handler-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/hive-beeline.jar:/usr/lib/hive/lib/commons-beanutils-1.7.0.jar:/usr/lib/hive/lib/geronimo-annotation_1.0_spec-1.1.1.jar:/usr/lib/hive/lib/commons-pool-1.5.4.jar:/usr/lib/hive/lib/zookeeper.jar:/usr/lib/hive/lib/hive-exec-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/hive-shims-scheduler-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/regexp-1.3.jar:/usr/lib/hive/lib/joda-time-1.6.jar:/usr/lib/hive/lib/hive-accumulo-handler.jar:/usr/lib/hive/lib/calcite-linq4j-1.0.0-incubating.jar:/usr/lib/hive/lib/antlr-2.7.7.jar:/usr/lib/hive/lib/parquet-hadoop-bundle.jar:/usr/lib/hive/lib/hive-hwi-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/commons-codec-1.4.jar:/usr/lib/hive/lib/hive-shims-scheduler.jar:/usr/lib/hive/lib/hive-jdbc.jar:/usr/lib/hive/lib/jamon-runtime-2.3.1.jar:/usr/lib/hive/lib/httpclient-4.2.5.jar:/usr/lib/hive/lib/super-csv-2.2.0.jar:/usr/lib/hive/lib/curator-recipes-2.6.0.jar:/usr/lib/hive/lib/metrics-json-3.0.2.jar:/usr/lib/hive/lib/junit-4.11.jar:/usr/lib/hive/lib/hive-metastore-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/hive-testutils-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/activation-1.1.jar:/usr/lib/hive/lib/accumulo-start-1.6.0.jar:/usr/lib/hive/lib/hive-ant.jar:/usr/lib/hive/lib/logredactor-1.0.3.jar:/usr/lib/hive/lib/hive-common.jar:/usr/lib/hive/lib/eigenbase-properties-1.1.4.jar:/usr/lib/hive/lib/geronimo-jta_1.1_spec-1.1.1.jar:/usr/lib/hive/lib/hive-jdbc-standalone.jar:/usr/lib/hive/lib/commons-dbcp-1.4.jar:/usr/lib/hive/lib/jetty-all-7.6.0.v20120127.jar:/usr/lib/hive/lib/commons-vfs2-2.0.jar:/usr/lib/hive/lib/jersey-servlet-1.14.jar:/usr/lib/hive/lib/stringtemplate-3.2.1.jar:/usr/lib/hive/lib/log4j-1.2.16.jar:/usr/lib/hive/lib/maven-scm-api-1.4.jar:/usr/lib/hive/lib/bonecp-0.8.0.RELEASE.jar:/usr/lib/hive/lib/derby-10.11.1.1.jar:/usr/lib/hive/lib/findbugs-
annotations-1.3.9-1.jar:/usr/lib/hive/lib/xz-1.0.jar:/usr/lib/hive/lib/metrics-jvm-3.0.2.jar:/usr/lib/hive/lib/commons-collections-3.2.2.jar:/usr/lib/hive/lib/curator-client-2.6.0.jar:/usr/lib/hive/lib/hive-serde-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/hive-hbase-handler-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/ant-launcher-1.9.1.jar:/usr/lib/hive/lib/commons-el-1.0.jar:/usr/lib/hive/lib/apache-log4j-extras-1.2.17.jar:/usr/lib/hive/lib/hive-exec.jar:/usr/lib/hive/lib/jline-2.12.jar:/usr/lib/hive/lib/hive-shims-common-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/calcite-core-1.0.0-incubating.jar:/usr/lib/hive/lib/hive-common-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/servlet-api-2.5.jar:/usr/lib/hive/lib/hive-shims-common.jar:/usr/lib/hive/lib/accumulo-trace-1.6.0.jar:/usr/lib/hive/lib/commons-httpclient-3.0.1.jar:/usr/lib/hive/lib/curator-framework-2.6.0.jar:/usr/lib/hive/lib/libfb303-0.9.2.jar:/usr/lib/hive/lib/commons-logging-1.1.3.jar:/usr/lib/hive/lib/hive-service.jar:/usr/lib/hive/lib/hbase-annotations.jar:/usr/lib/hive/lib/hive-service-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/jackson-core-2.2.2.jar:/usr/lib/hive/lib/calcite-avatica-1.0.0-incubating.jar:/usr/lib/hive/lib/jsp-api-2.1.jar:/usr/lib/hive/lib/datanucleus-api-jdo-3.2.6.jar:/usr/lib/hive/lib/commons-lang-2.6.jar:/usr/lib/hive/lib/jackson-databind-2.2.2.jar:/usr/lib/hive/lib/accumulo-fate-1.6.0.jar:/usr/lib/hive/lib/jackson-jaxrs-1.9.2.jar:/usr/lib/hive/lib/commons-cli-1.2.jar:/usr/lib/hive/lib/jpam-1.1.jar:/usr/lib/hive/lib/hive-shims-0.23-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/hive-beeline-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/hive-shims-0.23.jar:/usr/lib/hive/lib/hive-jdbc-1.1.0-cdh5.7.0-standalone.jar:/usr/lib/hive/lib/jsr305-3.0.0.jar:/usr/lib/hive/lib/jasper-compiler-5.5.23.jar:/usr/lib/hive/lib/hive-shims-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/asm-commons-3.1.jar:/usr/lib/hive/lib/opencsv-2.3.jar:/usr/lib/hive/lib/jersey-server-1.14.jar:/usr/lib/hive/lib/hive-ant-1.1.0-cdh5.7.0.jar:/usr/lib/hive/lib/datanucleus-core-3.2.10.j
ar:/usr/lib/hive/lib/hive-contrib.jar:/usr/lib/hive/lib/hive-serde.jar:/usr/lib/hive/lib/janino-2.7.6.jar:/usr/lib/hive/lib/httpcore-4.2.5.jar::/usr/lib/spark/lib/spark-assembly.jar::/usr/lib/hadoop/lib/*:/usr/lib/hadoop/*:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-hdfs/*:/usr/lib/hadoop-mapreduce/lib/*:/usr/lib/hadoop-mapreduce/*:/usr/lib/hadoop-yarn/lib/*:/usr/lib/hadoop-yarn/*:/usr/lib/hive/lib/*:/usr/lib/flume-ng/lib/*:/usr/lib/paquet/lib/*:/usr/lib/avro/lib/*' as a work-around.
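
One way to check whether any jar on that classpath actually ships the missing class is to scan the jars for the corresponding .class entry. The following is only a minimal sketch under stated assumptions: the helper name find_class_in_classpath is hypothetical, the classpath string is assumed to be copied from the log line above, and the script is assumed to run on the host where those paths exist. It only inspects jar contents and says nothing about class loader ordering or version mismatches.

```python
import glob
import os
import zipfile


def find_class_in_classpath(classpath, class_name):
    """Return the jars on `classpath` that contain `class_name` (hypothetical helper)."""
    # A class a.b.C is stored inside a jar as the zip entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for element in classpath.split(":"):
        if not element:
            continue  # the logged classpath contains empty '::' elements
        if element.endswith("*"):
            # JVM wildcard: every jar directly inside that directory
            jars = sorted(glob.glob(os.path.join(os.path.dirname(element), "*.jar")))
        else:
            jars = [element]
        for jar in jars:
            if not (jar.endswith(".jar") and os.path.isfile(jar)):
                continue  # skip missing files and non-jar elements
            try:
                with zipfile.ZipFile(jar) as zf:
                    if entry in zf.namelist():
                        hits.append(jar)
            except zipfile.BadZipFile:
                pass  # not a readable jar; ignore it
    return hits


if __name__ == "__main__":
    # Assumption: shortened stand-in for the classpath value from the log line
    classpath = "/usr/lib/hive/lib/hive-exec.jar::/usr/lib/spark/lib/spark-assembly.jar"
    cls = "org.apache.hadoop.hive.ql.parse.VariableSubstitution"
    for jar in find_class_in_classpath(classpath, cls):
        print(jar)
```

If the script prints nothing, none of the jars it scanned contains the class. If it prints one or more jars, the class is present on disk and the problem is more likely in how the classpath reaches the JVM.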

 

 

I'm using Spark 1.5.0+cdh5.6.0+113-1.cdh5.6.0.p0.104~trusty-cdh5.6.0 from Cloudera. I can provide any other information if needed. Could you point out what I am doing wrong here?