
Spark Job Failing on CDH 5.13.0

Hi Guys,

We have just upgraded from CDH 5.4.4 to CDH 5.13.0, and we are now facing a failure when running a Spark job on the YARN cluster.

The failure appears to be related to a jets3t version mismatch between the Spark and Hadoop dependencies on the cluster at runtime, and it shows up when submitting the job via Hue. The stack trace below is what we get.


2018-01-09 15:18:36,944 [broadcast-hash-join-1] ERROR org.apache.hadoop.yarn.YarnUncaughtExceptionHandler  - Thread Thread[broadcast-hash-join-1,5,main] threw an Error.  Shutting down now...
java.lang.VerifyError: Bad type on operand stack
Exception Details:
  Location:
    org/apache/hadoop/fs/s3native/Jets3tNativeFileSystemStore.copy(Ljava/lang/String;Ljava/lang/String;)V @152: invokevirtual
  Reason:
    Type 'org/jets3t/service/model/S3Object' (current frame, stack[4]) is not assignable to 'org/jets3t/service/model/StorageObject'
  Current Frame:
    bci: @152
    flags: { }
    locals: { 'org/apache/hadoop/fs/s3native/Jets3tNativeFileSystemStore', 'java/lang/String', 'java/lang/String', 'org/jets3t/service/model/S3Object' }
    stack: { 'org/jets3t/service/S3Service', 'java/lang/String', 'java/lang/String', 'java/lang/String', 'org/jets3t/service/model/S3Object', integer }
  Bytecode:
    0000000: b200 3fb9 0065 0100 9900 36b2 003f bb00
    0000010: 5759 b700 5812 66b6 0059 2bb6 0059 1267
    0000020: b600 592c b600 5912 68b6 0059 2ab4 0021
    0000030: b600 3ab6 0059 b600 5ab9 0069 0200 2ab4
    0000040: 0010 9900 302a b400 0b2a b400 212b 0101
    0000050: 0101 b600 6a4e 2ab4 001a 0994 9e00 162d
    0000060: b600 6b2a b400 1a94 9e00 0a2a 2d2c b600
    0000070: 6cb1 bb00 2859 2cb7 0029 4e2d 2ab4 001d
    0000080: b600 2e2a b400 0b2a b400 21b6 003a 2b2a
    0000090: b400 21b6 003a 2d03 b600 6d57 a700 0a4e
    00000a0: 2a2d 2bb7 0033 b1                      
  Exception Handler Table:
    bci [0, 113] => handler: 159
    bci [114, 156] => handler: 159
  Stackmap Table:
    same_frame(@62)
    same_frame(@114)
    same_locals_1_stack_item_frame(@159,Object[#214])
    same_frame(@166)

	at org.apache.hadoop.fs.s3native.NativeS3FileSystem.createDefaultStore(NativeS3FileSystem.java:342)
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem.initialize(NativeS3FileSystem.java:332)
	at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2816)
	at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:98)
	at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:2853)
	at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:2835)
	at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:387)
	at org.apache.hadoop.fs.Path.getFileSystem(Path.java:296)
	at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:258)
	at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:229)
	at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:315)
	at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:202)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
	at scala.Option.getOrElse(Option.scala:120)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
	at scala.Option.getOrElse(Option.scala:120)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
	at scala.Option.getOrElse(Option.scala:120)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
	at scala.Option.getOrElse(Option.scala:120)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
	at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
	at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
	at scala.Option.getOrElse(Option.scala:120)
	at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
	at org.apache.spark.ShuffleDependency.<init>(Dependency.scala:91)
	at org.apache.spark.sql.execution.Exchange.prepareShuffleDependency(Exchange.scala:220)
	at org.apache.spark.sql.execution.Exchange$$anonfun$doExecute$1.apply(Exchange.scala:254)
	at org.apache.spark.sql.execution.Exchange$$anonfun$doExecute$1.apply(Exchange.scala:248)
	at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:48)
	at org.apache.spark.sql.execution.Exchange.doExecute(Exchange.scala:247)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1.apply(TungstenAggregate.scala:86)
	at org.apache.spark.sql.execution.aggregate.TungstenAggregate$$anonfun$doExecute$1.apply(TungstenAggregate.scala:80)
	at org.apache.spark.sql.catalyst.errors.package$.attachTree(package.scala:48)
	at org.apache.spark.sql.execution.aggregate.TungstenAggregate.doExecute(TungstenAggregate.scala:80)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:132)
	at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$5.apply(SparkPlan.scala:130)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
	at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:130)
	at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1$$anonfun$apply$1.apply(BroadcastHashOuterJoin.scala:85)
	at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1$$anonfun$apply$1.apply(BroadcastHashOuterJoin.scala:82)
	at org.apache.spark.sql.execution.SQLExecution$.withExecutionId(SQLExecution.scala:90)
	at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1.apply(BroadcastHashOuterJoin.scala:82)
	at org.apache.spark.sql.execution.joins.BroadcastHashOuterJoin$$anonfun$broadcastFuture$1.apply(BroadcastHashOuterJoin.scala:82)
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.liftedTree1$1(Future.scala:24)
	at scala.concurrent.impl.Future$PromiseCompletingRunnable.run(Future.scala:24)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
	at java.lang.Thread.run(Thread.java:745)
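
For what it's worth, below is a small diagnostic sketch (not taken from our actual job) that can be run in spark-shell on the cluster to see which jars the conflicting classes are loaded from. The class names come straight from the VerifyError above; the locate helper itself is just illustrative.

// Diagnostic sketch: print the jar each of the conflicting classes is loaded from.
// Run in spark-shell on the cluster; class names are taken from the VerifyError above.
def locate(className: String): Unit = {
  val clazz = Class.forName(className)
  val source = Option(clazz.getProtectionDomain.getCodeSource)
    .map(_.getLocation.toString)
    .getOrElse("(no code source - likely loaded from the bootstrap classpath)")
  println(s"$className -> $source")
}

// The Hadoop S3 store that fails verification...
locate("org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore")
// ...and the jets3t classes it was compiled against.
locate("org.jets3t.service.model.S3Object")
locate("org.jets3t.service.model.StorageObject")

If the jets3t entries point at different jars than expected (for example, one bundled in the application jar and one shipped with the CDH parcel), that would be consistent with the VerifyError above.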

I have been stuck on this for some time now, and any help or possible direction would be appreciated.
