Spark job fails when creating a new HiveContext object

Explorer

We are using HDP 2.3.4. I also followed the instructions below.

    spark-submit \
      --class <Your.class.name> \
      --master yarn-cluster \
      --num-executors 1 \
      --driver-memory 1g \
      --executor-memory 1g \
      --executor-cores 1 \
      --files /usr/hdp/current/spark-client/conf/hive-site.xml \
      --jars /usr/hdp/current/spark-client/lib/datanucleus-api-jdo-3.2.6.jar,/usr/hdp/current/spark-client/lib/datanucleus-rdbms-3.2.9.jar,/usr/hdp/current/spark-client/lib/datanucleus-core-3.2.10.jar \
      target/YOUR_JAR-1.0.0-SNAPSHOT.jar "show tables" "select * from your_table"

Here is the call stack:

    16/09/06 15:20:35 WARN Hive: Failed to access metastore. This class should not accessed in runtime.
    org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
        at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1236)
        at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
        at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
        at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:193)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:183)
        at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:179)
        at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:228)
        at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:187)
        at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:394)
        at org.apache.spark.sql.hive.HiveContext.defaultOverrides(HiveContext.scala:176)
        at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:179)
        at com.cbt.ingest.tsz.TSZIngestApp$delayedInit$body.apply(TSZIngestApp.scala:50)
        at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
        at scala.App$anonfun$main$1.apply(App.scala:71)
        at scala.App$anonfun$main$1.apply(App.scala:71)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)

    Could you post your spark code?

    Also, could you post any additional context from the stack trace? Are there additional exceptions?

    Explorer

    It fails at the beginning of my code, when creating a new HiveContext object:

    log.warn("Running Master: " + master.toString())
    val sparkConf = new SparkConf()
      .setAppName(APP_NAME)
      .setMaster(master)
    val sc = SparkContext.getOrCreate(sparkConf)
    val sqlContext = new SQLContext(sc)
    val hiveSqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

    Explorer

    Actually, I noticed that it connects to the metastore successfully at the beginning; later on it tries to connect again and fails.

    16/09/07 13:14:11 INFO DFSClient: Created HDFS_DELEGATION_TOKEN token 1297829 for jzhou5 on ha-hdfs:hd0

    16/09/07 13:14:12 INFO metastore: Trying to connect to metastore with URI thrift://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

    16/09/07 13:14:12 INFO metastore: Connected to metastore.

    16/09/07 13:14:12 INFO Client: Uploading resource file:/usr/hdp/2.3.4.0-3485/spark/lib/spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar -> hdfs://hd0/user/.sparkStaging/application_1473231848025_1554/spark-assembly-1.5.2.2.3.4.0-3485-hadoop2.7.1.2.3.4.0-3485.jar

    ....

    16/09/07 13:14:24 INFO HiveContext: Initializing execution hive, version 1.2.1

    16/09/07 13:14:24 INFO ClientWrapper: Inspected Hadoop version: 2.7.1.2.3.4.0-3485

    16/09/07 13:14:24 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.7.1.2.3.4.0-3485

    16/09/07 13:14:24 INFO metastore: Trying to connect to metastore with URI thrift://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

    16/09/07 13:14:24 INFO metastore: Connected to metastore.

    16/09/07 13:14:24 INFO SessionState: Created local directory: /tmp/2a51b1f7-5c87-4b2a-95c6-bc7eb06d900b_resources

    16/09/07 13:14:24 INFO SessionState: Created HDFS directory: /tmp/hive/jzhou5/2a51b1f7-5c87-4b2a-95c6-

    ....

    16/09/07 13:14:25 INFO metastore: Trying to connect to metastore with URI thrift://xxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

    16/09/07 13:14:25 WARN Hive: Failed to access metastore. This class should not accessed in runtime.

    org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient

    at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1236)

    at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)

    at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)

    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)

    at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:193)

    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)

    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)

    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)

    Explorer

    Here is the full error output of the job:

    16/09/07 14:21:36 WARN Hive: Failed to access metastore. This class should not accessed in runtime.
    org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
        at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1236)
        at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
        at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
        at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:193)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:183)
        at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:179)
        at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:228)
        at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:187)
        at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:394)
        at org.apache.spark.sql.hive.HiveContext.defaultOverrides(HiveContext.scala:176)
        at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:179)
        at com.cbt.ingest.tsz.TSZIngestApp$delayedInit$body.apply(TSZIngestApp.scala:53)
        at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
        at scala.App$anonfun$main$1.apply(App.scala:71)
        at scala.App$anonfun$main$1.apply(App.scala:71)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
        at scala.App$class.main(App.scala:71)
        at com.cbt.ingest.tsz.TSZIngestApp.main(TSZIngestApp.scala:29)
        at com.cbt.ingest.tsz.GenericTSZIngest.main(TSZIngestApp.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:685)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
        at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
        at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1234)
        ... 34 more
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
        ... 40 more
    Caused by: java.lang.IllegalStateException: Error finding hadoop SASL properties
        at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge23.getHadoopSaslProperties(HadoopThriftAuthBridge23.java:103)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.getMetaStoreSaslProperties(MetaStoreUtils.java:1588)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:401)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
        at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
        ... 45 more
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge23.getHadoopSaslProperties(HadoopThriftAuthBridge23.java:98)
        ... 49 more
    Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.util.StringUtils.toUpperCase(Ljava/lang/String;)Ljava/lang/String;
        at org.apache.hadoop.security.SaslPropertiesResolver.setConf(SaslPropertiesResolver.java:69)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.security.SaslPropertiesResolver.getInstance(SaslPropertiesResolver.java:58)
        ... 54 more
    16/09/07 14:21:36 INFO metastore: Trying to connect to metastore with URI thrift://xxxxxxxxxxxxxxxxxxx:9083
    Exception in thread "main" java.lang.RuntimeException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:522)
        at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:193)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:183)
        at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:179)
        at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:228)
        at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:187)
        at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:394)
        at org.apache.spark.sql.hive.HiveContext.defaultOverrides(HiveContext.scala:176)
        at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:179)
        at com.cbt.ingest.tsz.TSZIngestApp$delayedInit$body.apply(TSZIngestApp.scala:53)
        at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
        at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
        at scala.App$anonfun$main$1.apply(App.scala:71)
        at scala.App$anonfun$main$1.apply(App.scala:71)
        at scala.collection.immutable.List.foreach(List.scala:318)
        at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
        at scala.App$class.main(App.scala:71)
        at com.cbt.ingest.tsz.TSZIngestApp.main(TSZIngestApp.scala:29)
        at com.cbt.ingest.tsz.GenericTSZIngest.main(TSZIngestApp.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$runMain(SparkSubmit.scala:685)
        at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
        at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
    Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1523)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:86)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:132)
        at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:104)
        at org.apache.hadoop.hive.ql.metadata.Hive.createMetaStoreClient(Hive.java:3005)
        at org.apache.hadoop.hive.ql.metadata.Hive.getMSC(Hive.java:3024)
        at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
        ... 31 more
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1521)
        ... 37 more
    Caused by: java.lang.IllegalStateException: Error finding hadoop SASL properties
        at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge23.getHadoopSaslProperties(HadoopThriftAuthBridge23.java:103)
        at org.apache.hadoop.hive.metastore.MetaStoreUtils.getMetaStoreSaslProperties(MetaStoreUtils.java:1588)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.open(HiveMetaStoreClient.java:401)
        at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:236)
        at org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient.<init>(SessionHiveMetaStoreClient.java:74)
        ... 42 more
    Caused by: java.lang.reflect.InvocationTargetException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.hive.thrift.HadoopThriftAuthBridge23.getHadoopSaslProperties(HadoopThriftAuthBridge23.java:98)
        ... 46 more
    Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.util.StringUtils.toUpperCase(Ljava/lang/String;)Ljava/lang/String;
        at org.apache.hadoop.security.SaslPropertiesResolver.setConf(SaslPropertiesResolver.java:69)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.security.SaslPropertiesResolver.getInstance(SaslPropertiesResolver.java:58)
        ... 51 more
    16/09/07 14:21:36 INFO SparkContext: Invoking stop() from shutdown hook

    Expert Contributor

    Is Kerberos enabled on the cluster?

    Explorer

    The problem is resolved by using SQLContext in the Spark application code. Thanks for the quick response.

    Contributor

    Hi @Jay Zhou

    Can you be a bit more specific about what you changed? What exactly did you do with this line?

    val hiveSqlContext = new org.apache.spark.sql.hive.HiveContext(sc)

    I have a similar problem where I get an error

    WARN Hive: Failed to access metastore. This class should not accessed in runtime.

    but only when I run the job via Oozie. When I use spark-submit, the code works, so I guess the dependencies are right.

    Do you have any idea what can cause this?

    Explorer

    Actually, the problem still exists, since I have to use HiveContext. I just noticed that ClientWrapper inspected two different Hadoop versions: the first one (2.7.1.2.3.4.0-3485) is correct and the second one (2.2.0) is wrong, as the log below shows. Could this be the root cause?

    16/09/07 15:47:54 INFO BlockManagerMasterEndpoint: Registering block manager xxxxxxxx

    16/09/07 15:47:54 INFO HiveContext: Initializing execution hive, version 1.2.1

    16/09/07 15:47:54 INFO ClientWrapper: Inspected Hadoop version: 2.7.1.2.3.4.0-3485

    16/09/07 15:47:54 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.7.1.2.3.4.0-3485

    16/09/07 15:47:54 INFO metastore: Trying to connect to metastore with URI thrift://xxxxxxxxxxxxxxxxxxxxxxxx

    16/09/07 15:47:54 INFO metastore: Connected to metastore.

    16/09/07 15:47:54 INFO SessionState: Created local directory: /tmp/1d43c90d-da99-4970-80fd-31c9ad9a8d4d_resources

    16/09/07 15:47:54 INFO SessionState: Created HDFS directory: /tmp/hive/jzhou5/1d43c90d-da99-4970-80fd-31c9ad9a8d4d

    16/09/07 15:47:54 INFO SessionState: Created local directory: /tmp/jzhou5/1d43c90d-da99-4970-80fd-31c9ad9a8d4d

    16/09/07 15:47:54 INFO SessionState: Created HDFS directory: /tmp/hive/jzhou5/1d43c90d-da99-4970-80fd-31c9ad9a8d4d/_tmp_space.db

    16/09/07 15:47:54 INFO HiveContext: default warehouse location is /user/hive/warehouse

    16/09/07 15:47:54 INFO HiveContext: Initializing HiveMetastoreConnection version 1.2.1 using Spark classes.

    16/09/07 15:47:54 INFO ClientWrapper: Inspected Hadoop version: 2.2.0

    16/09/07 15:47:54 INFO ClientWrapper: Loaded org.apache.hadoop.hive.shims.Hadoop23Shims for Hadoop version 2.2.0

    16/09/07 15:47:55 INFO deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces

    16/09/07 15:47:55 INFO deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize

    I think there is definitely a clash of versions. The NoSuchMethodError below indicates a version mismatch when the client is creating a session:

    Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.util.StringUtils.toUpperCase(Ljava/lang/String;)Ljava/lang/String;
        at org.apache.hadoop.security.SaslPropertiesResolver.setConf(SaslPropertiesResolver.java:69)
        at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
        at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
        at org.apache.hadoop.security.SaslPropertiesResolver.getInstance(SaslPropertiesResolver.java:58)
        ... 54 more
    16/09/07 14:21:36 INFO metastore: Trying to connect to metastore with URI thrift://xxxxxxxxxxxxxxxxxxx:9083
    Exception in thread "main"
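A mismatch like this can usually be confirmed by asking the JVM which jar a class came from and whether the expected method exists. The following is a minimal sketch, not code from this thread: the object name ClasspathProbe and its two helpers are hypothetical, it relies only on JDK reflection, and it is only meaningful when run with the same classpath as the failing driver.

```scala
object ClasspathProbe {

  /** Location (jar or directory) a class was loaded from, if the JVM reports one. */
  def loadedFrom(className: String): Option[String] =
    try {
      val cls = Class.forName(className)
      for {
        domain   <- Option(cls.getProtectionDomain)
        source   <- Option(domain.getCodeSource)
        location <- Option(source.getLocation)
      } yield location.toString
    } catch {
      case _: ClassNotFoundException => None
    }

  /** True if the class is on the classpath and has a public method with exactly this signature. */
  def hasMethod(className: String, method: String, paramTypes: Class[_]*): Boolean =
    try {
      Class.forName(className).getMethod(method, paramTypes: _*)
      true
    } catch {
      case _: ClassNotFoundException => false
      case _: NoSuchMethodException  => false
    }

  def main(args: Array[String]): Unit = {
    // Class and method taken from the NoSuchMethodError in the stack trace.
    val cls = "org.apache.hadoop.util.StringUtils"
    println("loaded from: " + loadedFrom(cls).getOrElse("<not found or unknown>"))
    println("has toUpperCase(String): " + hasMethod(cls, "toUpperCase", classOf[String]))
  }
}
```

If the reported location is an unexpected jar (for example one bundled into the application jar) and the method check prints false, that would be consistent with an older hadoop-common shadowing the cluster's version.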

    Check the contents of the jars to make sure they are all compatible. For example, what are the contents of target/YOUR_JAR-1.0.0-SNAPSHOT.jar?
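One way to do that check is to list what the application jar actually bundles. This is a minimal sketch assuming a standard jar layout; JarInspector and entriesUnder are hypothetical names, and the default path is just the placeholder from the spark-submit command at the top of the thread.

```scala
import java.io.File
import java.util.jar.JarFile
import scala.collection.mutable.ListBuffer

object JarInspector {

  /** Names of all entries in the jar whose path starts with the given prefix. */
  def entriesUnder(jarPath: String, prefix: String): List[String] = {
    val jar = new JarFile(jarPath)
    try {
      val names = ListBuffer.empty[String]
      val entries = jar.entries()
      while (entries.hasMoreElements) {
        val name = entries.nextElement().getName
        if (name.startsWith(prefix)) names += name
      }
      names.toList
    } finally {
      jar.close()
    }
  }

  def main(args: Array[String]): Unit = {
    // Placeholder path; pass the real application jar as the first argument.
    val jarPath = args.headOption.getOrElse("target/YOUR_JAR-1.0.0-SNAPSHOT.jar")
    if (!new File(jarPath).isFile) {
      println("No such jar: " + jarPath)
    } else {
      val bundled = entriesUnder(jarPath, "org/apache/hadoop/")
      println(bundled.size + " Hadoop class entries bundled in " + jarPath)
      bundled.take(10).foreach(println)
    }
  }
}
```

Any entries under org/apache/hadoop/ inside the application jar were packaged at build time, so they may not match the Hadoop version installed on the cluster.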

    Explorer

    I built it in my local Windows environment. I noticed the Hadoop version is 2.2; the jars are downloaded automatically by the Maven build. Where do I set the version? I don't see it in my pom.xml.

    Here is part of my pom.xml:

    <properties>
      <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
      <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
      <!-- Component versions are defined here -->
      <hadoop.version>2.7.1</hadoop.version>
      <spark.version>1.5.2</spark.version>
      <avro.version>1.8.1</avro.version>
      <log4j.version>1.2.17</log4j.version>
      <scala.version>2.10.6</scala.version>
    </properties>

    <pluginRepositories>
      <pluginRepository>
        <id>scala-tools.org</id>
        <name>Scala-tools Maven2 Repository</name>
        <url>http://scala-tools.org/repo-releases</url>
      </pluginRepository>
    </pluginRepositories>

    <dependencies>
      <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-core_2.10</artifactId>
        <version>${spark.version}</version>
      </dependency>
      <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-sql_2.10</artifactId>
        <version>${spark.version}</version>
      </dependency>
      <dependency>
        <groupId>com.datastax.spark</groupId>
        <artifactId>spark-cassandra-connector_2.10</artifactId>
        <version>1.5.1</version>
      </dependency>
      <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-hive_2.10</artifactId>
        <version>${spark.version}</version>
      </dependency>
      <dependency>
        <groupId>com.databricks</groupId>
        <artifactId>spark-csv_2.10</artifactId>
        <version>1.4.0</version>
      </dependency>
      <dependency>
        <groupId>com.databricks</groupId>
        <artifactId>spark-xml_2.10</artifactId>
        <version>0.3.3</version>
      </dependency>
      <dependency>
        <groupId>com.databricks</groupId>
        <artifactId>spark-avro_2.10</artifactId>
        <version>2.0.1</version>
      </dependency>
      <dependency>
        <groupId>com.google.guava</groupId>
        <artifactId>guava</artifactId>
        <version>18.0</version>
      </dependency>
      <dependency>
        <groupId>org.scalikejdbc</groupId>
        <artifactId>scalikejdbc_2.10</artifactId>
        <version>2.4.2</version>
      </dependency>
      <dependency>
        <groupId>org.apache.spark</groupId>
        <artifactId>spark-mllib_2.10</artifactId>
        <version>${spark.version}</version>
      </dependency>
      <dependency>
        <groupId>org.apache.hive</groupId>
        <artifactId>hive-jdbc</artifactId>
        <version>1.2.1</version>
      </dependency>

    New Contributor

    I agree with @cduby that there is a version conflict between the Hadoop library in use and the one Spark actually expects. The best way to track down such a problem is to run Maven's dependency:tree goal and look for the artifact that contains the problematic class. This shows which transitive dependencies your Spark application pulls in by default.

    I had exactly the same problem and solved it with the following process.

    • Find which artifact the org.apache.hadoop.util.StringUtils class belongs to. This is the hadoop-common library.
    • Then execute mvn dependency:tree to find out which version of this jar Spark fetches by default (note that the automatic dependency resolution happens only if you haven't already provided the Hadoop libraries yourself). In my case, this was version 2.2 of Apache hadoop-common.
    • Then find the version of the library that contains the correct version of StringUtils. This can be quite difficult, but in my case I happened to know it from other projects: it was version 2.6.1.
    • Declare that dependency directly in your pom.xml so that it takes precedence over Spark's transitive dependency. I placed it before the Spark dependencies; Maven prefers the nearest declaration, so a direct dependency wins over a transitive one.
    • Then it should work.
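The second step can be narrowed with the plugin's own filter, for example mvn dependency:tree -Dincludes=org.apache.hadoop:hadoop-common. When the tree output has already been saved, a small scan can pull out the resolved versions instead. The sketch below is only an illustration: DependencyTreeScan and versionsOf are hypothetical names, the sample lines are made up to resemble dependency:tree output, and it assumes each coordinate is printed as groupId:artifactId:type[:classifier]:version:scope.

```scala
object DependencyTreeScan {

  /** Distinct versions of groupId:artifactId found in `mvn dependency:tree` output lines. */
  def versionsOf(treeLines: Seq[String], groupId: String, artifactId: String): Seq[String] = {
    val wanted = groupId + ":" + artifactId + ":"
    treeLines.flatMap { line =>
      val start = line.indexOf(wanted)
      if (start < 0) None
      else {
        // The coordinate runs up to the next whitespace; the version is the
        // second-to-last field, just before the scope.
        val coordinate = line.substring(start).takeWhile(c => !c.isWhitespace)
        val parts = coordinate.split(":")
        if (parts.length >= 5) Some(parts(parts.length - 2)) else None
      }
    }.distinct
  }

  def main(args: Array[String]): Unit = {
    // Illustrative input only; real lines come from running dependency:tree on your project.
    val sample = Seq(
      "[INFO] +- org.apache.spark:spark-core_2.10:jar:1.5.2:compile",
      "[INFO] |  +- org.apache.hadoop:hadoop-client:jar:2.2.0:compile",
      "[INFO] |  |  +- org.apache.hadoop:hadoop-common:jar:2.2.0:compile"
    )
    println(versionsOf(sample, "org.apache.hadoop", "hadoop-common").mkString(", "))
  }
}
```

More than one version in the result, or a version that differs from the cluster's Hadoop, would point at the kind of conflict described in the steps above.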

    The following hadoop-common dependency solved the problem for me.

    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>2.6.1</version>
    </dependency>
    <!-- Spark dependencies -->
    <dependency>
    ...

    @Jay Zhou and @Georgios Gkekas Also check out this article on how to use the artifacts in the Hortonworks repository from Maven. It is written for building streaming applications, but it should translate to other Spark applications:

    https://community.hortonworks.com/articles/30430/a-maven-pomxml-for-java-based-sparkstreaming-appli....

    Explorer

    Thanks. Yes, that is what I did. I resolved this issue a few weeks ago; sorry for the late update.

    Glad you got it working.

    Explorer

    Thanks for helping.
