Spark job fails when creating a new HiveContext object
Labels: Apache Hive, Apache Spark
Created 09-07-2016 03:24 PM
We are using HDP 2.3.4. I also followed the instructions below.
Here is the call stack:
16/09/06 15:20:35 WARN Hive: Failed to access metastore. This class should not accessed in runtime.
org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.ql.metadata.SessionHiveMetaStoreClient
    at org.apache.hadoop.hive.ql.metadata.Hive.getAllDatabases(Hive.java:1236)
    at org.apache.hadoop.hive.ql.metadata.Hive.reloadFunctions(Hive.java:174)
    at org.apache.hadoop.hive.ql.metadata.Hive.<clinit>(Hive.java:166)
    at org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:503)
    at org.apache.spark.sql.hive.client.ClientWrapper.<init>(ClientWrapper.scala:193)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:57)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.liftedTree1$1(IsolatedClientLoader.scala:183)
    at org.apache.spark.sql.hive.client.IsolatedClientLoader.<init>(IsolatedClientLoader.scala:179)
    at org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:228)
    at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:187)
    at org.apache.spark.sql.hive.HiveContext.setConf(HiveContext.scala:394)
    at org.apache.spark.sql.hive.HiveContext.defaultOverrides(HiveContext.scala:176)
    at org.apache.spark.sql.hive.HiveContext.<init>(HiveContext.scala:179)
    at com.cbt.ingest.tsz.TSZIngestApp$delayedInit$body.apply(TSZIngestApp.scala:50)
    at scala.Function0$class.apply$mcV$sp(Function0.scala:40)
    at scala.runtime.AbstractFunction0.apply$mcV$sp(AbstractFunction0.scala:12)
    at scala.App$anonfun$main$1.apply(App.scala:71)
    at scala.App$anonfun$main$1.apply(App.scala:71)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at scala.collection.generic.TraversableForwarder$class.foreach(TraversableForwarder.scala:32)
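For context, the failure happens while constructing the HiveContext. The actual TSZIngestApp code is not reproduced in this thread, so the following is only a minimal sketch of the pattern that triggers the error:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Minimal sketch (hypothetical): the real TSZIngestApp is not shown here.
object HiveContextSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("HiveContextSketch"))

    // The exception above is thrown from this constructor, while Spark tries
    // to instantiate the Hive metastore client (SessionHiveMetaStoreClient).
    val hiveContext = new HiveContext(sc)

    hiveContext.sql("SHOW DATABASES").show()
    sc.stop()
  }
}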
Created 09-07-2016 09:52 PM
The problem is resolved by using SQLContext in the Spark application code. Thanks for the quick response.
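For reference, a minimal sketch of that workaround, assuming the job only needs plain Spark SQL and does not read or write Hive metastore tables:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Sketch of the workaround: use SQLContext instead of HiveContext, which
// avoids instantiating the Hive metastore client entirely.
object SqlContextWorkaround {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("SqlContextWorkaround"))
    val sqlContext = new SQLContext(sc)

    // Works for DataFrame/SQL operations that do not need Hive tables or UDFs.
    // The input path is hypothetical.
    val df = sqlContext.read.json("hdfs:///tmp/example.json")
    df.registerTempTable("example")
    sqlContext.sql("SELECT COUNT(*) FROM example").show()

    sc.stop()
  }
}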
Created 09-09-2016 04:30 PM
I think there is definitely a clash of versions. The reflection error below indicates a version mismatch when the client creates a session:
more
Caused by: java.lang.NoSuchMethodError: org.apache.hadoop.util.StringUtils.toUpperCase(Ljava/lang/String;)Ljava/lang/String;
    at org.apache.hadoop.security.SaslPropertiesResolver.setConf(SaslPropertiesResolver.java:69)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:73)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:133)
    at org.apache.hadoop.security.SaslPropertiesResolver.getInstance(SaslPropertiesResolver.java:58)
    ... 54 more
16/09/07 14:21:36 INFO metastore: Trying to connect to metastore with URI thrift://xxxxxxxxxxxxxxxxxxx:9083
Exception in thread "main"
Check the contents of the jars to make sure they are all compatible. For example, what are the contents of target/YOUR_JAR-1.0.0-SNAPSHOT.jar?
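One quick way to see which jar actually provides a suspect class at runtime is to ask the classloader for its code source. This is not from the original reply, just a small diagnostic sketch:

// Diagnostic sketch: print the jar a class was loaded from, to spot
// mismatched Hadoop/Hive versions on the driver classpath.
object WhichJar {
  def locate(className: String): String = {
    val src = Class.forName(className).getProtectionDomain.getCodeSource
    if (src != null) src.getLocation.toString else "unknown (bootstrap classloader)"
  }

  def main(args: Array[String]): Unit = {
    println(locate("org.apache.hadoop.util.StringUtils"))
    println(locate("org.apache.hadoop.hive.ql.metadata.Hive"))
  }
}

Running this with the same classpath as the spark-submit job shows whether the Hadoop classes come from your assembly jar or from the cluster's own libraries.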
Created 09-09-2016 07:34 PM
I built it in my local Windows environment. I noticed the Hadoop version is 2.2; the jars are downloaded automatically by the Maven build. Where do I set the version? I didn't see it in my pom.xml.
Here is the partial content of my pom.xml:
<properties>
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
  <!-- Component versions are defined here -->
  <hadoop.version>2.7.1</hadoop.version>
  <spark.version>1.5.2</spark.version>
  <avro.version>1.8.1</avro.version>
  <log4j.version>1.2.17</log4j.version>
  <scala.version>2.10.6</scala.version>
</properties>

<pluginRepositories>
  <pluginRepository>
    <id>scala-tools.org</id>
    <name>Scala-tools Maven2 Repository</name>
    <url>http://scala-tools.org/repo-releases</url>
  </pluginRepository>
</pluginRepositories>

<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.10</artifactId>
    <version>${spark.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-sql_2.10</artifactId>
    <version>${spark.version}</version>
  </dependency>
  <dependency>
    <groupId>com.datastax.spark</groupId>
    <artifactId>spark-cassandra-connector_2.10</artifactId>
    <version>1.5.1</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-hive_2.10</artifactId>
    <version>${spark.version}</version>
  </dependency>
  <dependency>
    <groupId>com.databricks</groupId>
    <artifactId>spark-csv_2.10</artifactId>
    <version>1.4.0</version>
  </dependency>
  <dependency>
    <groupId>com.databricks</groupId>
    <artifactId>spark-xml_2.10</artifactId>
    <version>0.3.3</version>
  </dependency>
  <dependency>
    <groupId>com.databricks</groupId>
    <artifactId>spark-avro_2.10</artifactId>
    <version>2.0.1</version>
  </dependency>
  <dependency>
    <groupId>com.google.guava</groupId>
    <artifactId>guava</artifactId>
    <version>18.0</version>
  </dependency>
  <dependency>
    <groupId>org.scalikejdbc</groupId>
    <artifactId>scalikejdbc_2.10</artifactId>
    <version>2.4.2</version>
  </dependency>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-mllib_2.10</artifactId>
    <version>${spark.version}</version>
  </dependency>
  <dependency>
    <groupId>org.apache.hive</groupId>
    <artifactId>hive-jdbc</artifactId>
    <version>1.2.1</version>
  </dependency>
Created 09-26-2016 04:06 PM
I agree with @cduby that there is a version conflict between the Hadoop library being used and the one Spark actually expects. The best way to find such a problem is to use Maven's dependency:tree goal together with the artifact that contains the problematic class. That way you can see which transitive dependencies your Spark application fetches by default.
I had exactly the same problem, and I followed the process below to solve it.
- Find which artifact the org.apache.hadoop.util.StringUtils class belongs to. This is the hadoop-common library.
- Then run mvn dependency:tree to find out which version of that jar Spark fetches by default (note that automatic dependency resolution only happens if you haven't already provided the Hadoop libraries yourself). In my case it was the 2.2 version of Apache hadoop-common.
- Then find the library version that contains the correct version of StringUtils. This can be quite difficult, but in my case I happened to know it from other projects; it was version 2.6.1.
- Declare that dependency in your pom.xml before the Spark dependencies, so that it takes precedence over Spark's transitive dependency.
- Then it should work.
The following hadoop-common dependency solved the problem for me.
<dependency>
  <groupId>org.apache.hadoop</groupId>
  <artifactId>hadoop-common</artifactId>
  <version>2.6.1</version>
</dependency>
<!-- Spark dependencies -->
<dependency>
  ...
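With the explicit hadoop-common dependency in place, it is also easy to double-check which Hadoop version actually ends up on the classpath at runtime via VersionInfo (part of hadoop-common). A small sketch, not from the original reply:

import org.apache.hadoop.util.VersionInfo

// Sketch: print the Hadoop version resolved on the classpath at runtime.
// If this still reports 2.2.x, the old transitive dependency is still winning.
object HadoopVersionCheck extends App {
  println("Hadoop version: " + VersionInfo.getVersion)
  println("Built from:     " + VersionInfo.getUrl + " (rev " + VersionInfo.getRevision + ")")
}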
Created 09-26-2016 04:29 PM
@Jay Zhou and @Georgios Gkekas Also check out this article on how to use the artifacts in the Hortonworks repository from Maven. It is written for building streaming applications, but it should translate to other Spark applications:
Created 09-26-2016 04:31 PM
Thanks. Yes, that is what I did. I resolved this issue a few weeks ago... sorry for the late update.
Created 09-26-2016 04:33 PM
Glad you got it working.
Created 09-26-2016 06:21 PM
Thanks for helping.