Created on 09-06-2016 10:25 PM - edited 09-16-2022 03:38 AM
Hello,
I am running a sort job through spark-submit. I've also written a custom compression codec (instead of the standard lzf or snappy), packaged in a jar file. My goal is to have Spark use my custom codec for compression.
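For reference, the codec implements Spark's CompressionCodec interface and is selected by setting spark.io.compression.codec to the class name. Its shape is roughly this (a sketch; the actual stream-wrapping logic is elided):

package com.test

import java.io.{InputStream, OutputStream}

import org.apache.spark.SparkConf
import org.apache.spark.io.CompressionCodec

// Sketch only. Spark instantiates the codec reflectively through a
// one-argument constructor taking a SparkConf, so that constructor must exist.
class new_codec(conf: SparkConf) extends CompressionCodec {

  // Real implementation wraps s in the custom compressing stream.
  override def compressedOutputStream(s: OutputStream): OutputStream = s

  // Real implementation wraps s in the custom decompressing stream.
  override def compressedInputStream(s: InputStream): InputStream = s
}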
For the first couple of compression operations, my codec is indeed used (I can see its debug messages). Later, an exception is thrown (a java.lang.NoSuchMethodException, saying that a constructor of my Java class isn't found).
I'm using YARN. My jar file is on the master node; its path is specified in /etc/spark/conf/classpath.txt.
Any ideas on why it's not being found (and only sometimes)? Perhaps I ought to specify the jar file's location in some other way?
Your suggestions please.
Thanks.
Created 09-06-2016 10:36 PM
The proper way to add a custom jar to the classpath is to use the "--jars" option with spark-submit:
--jars JARS Comma-separated list of local jars to include on the driver and executor classpaths.
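For example (paths and class names below are placeholders):

spark-submit \
  --master yarn \
  --jars /path/on/master/custom-codec.jar \
  --conf spark.io.compression.codec=com.test.new_codec \
  --class com.example.SortJob \
  sort-job.jar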
Created 09-06-2016 11:40 PM
Thank you for your reply.
I tried out your suggestion.
My spark-submit command now includes a --jars option specifying the local path of the custom jar file on the master node.
With this change, the job continues to behave as earlier. It successfully finds the class in the custom jar file for the first couple of invocations, and later throws a java.lang.NoSuchMethodException.
It feels like there's something simple I'm missing.
Thanks.
Created 09-07-2016 01:00 AM
--jars isn't quite relevant here, as it just puts the classes in the same classloader they'd be in if you'd packaged them with your app. The key is that there are many classloaders at play, and only some of them can see user classes. IIRC codec classes are among the most problematic because they're needed deep inside Spark itself. You should post more details about the failure. Also try the "user classpath first" options.
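If memory serves, the "user classpath first" options are the (experimental) spark.driver.userClassPathFirst and spark.executor.userClassPathFirst settings, e.g.:

spark-submit \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  ...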
Created 09-07-2016 03:01 AM
Thanks for your message. It definitely seems like different things are at play here.
Here's the stack trace of the failure.
User class threw exception: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 7, nje-amazon4): java.io.IOException: java.lang.NoSuchMethodException: com.test.new_codec.<init>(org.apache.spark.SparkConf)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1178)
at org.apache.spark.broadcast.TorrentBroadcast.readBroadcastBlock(TorrentBroadcast.scala:165)
at org.apache.spark.broadcast.TorrentBroadcast._value$lzycompute(TorrentBroadcast.scala:64)
at org.apache.spark.broadcast.TorrentBroadcast._value(TorrentBroadcast.scala:64)
at org.apache.spark.broadcast.TorrentBroadcast.getValue(TorrentBroadcast.scala:88)
at org.apache.spark.broadcast.Broadcast.value(Broadcast.scala:70)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:65)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.run(Task.scala:88)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NoSuchMethodException: com.test.new_codec.<init>(org.apache.spark.SparkConf)
at java.lang.Class.getConstructor0(Class.java:2849)
at java.lang.Class.getConstructor(Class.java:1718)
at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:66)
at org.apache.spark.io.CompressionCodec$.createCodec(CompressionCodec.scala:60)
at org.apache.spark.broadcast.TorrentBroadcast.org$apache$spark$broadcast$TorrentBroadcast$$setConf(TorrentBroadcast.scala:73)
at org.apache.spark.broadcast.TorrentBroadcast$$anonfun$readBroadcastBlock$1.apply(TorrentBroadcast.scala:167)
at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:1175)
... 12 more
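Interestingly, it's a NoSuchMethodException rather than a ClassNotFoundException, so the class itself is being found; it's the (SparkConf) constructor lookup that fails. Judging from the bottom frames, the failing code is roughly this (my sketch of what those frames do, not the exact Spark source):

import org.apache.spark.SparkConf
import org.apache.spark.io.CompressionCodec

def createCodec(conf: SparkConf): CompressionCodec = {
  val codecClass = Class.forName("com.test.new_codec")      // the class loads fine
  val ctor = codecClass.getConstructor(classOf[SparkConf])  // NoSuchMethodException is thrown here
  ctor.newInstance(conf).asInstanceOf[CompressionCodec]
}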
Thanks.
Created 09-07-2016 03:10 AM
The corresponding driver stack trace is:
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1294)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1282)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1281)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1281)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:697)
at scala.Option.foreach(Option.scala:236)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:697)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1507)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1469)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1458)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:567)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1824)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1837)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1850)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1921)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:905)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.RDD.collect(RDD.scala:904)
at org.apache.spark.RangePartitioner$.sketch(Partitioner.scala:264)
at org.apache.spark.RangePartitioner.<init>(Partitioner.scala:126)
at org.apache.spark.rdd.OrderedRDDFunctions$$anonfun$sortByKey$1.apply(OrderedRDDFunctions.scala:62)
at org.apache.spark.rdd.OrderedRDDFunctions$$anonfun$sortByKey$1.apply(OrderedRDDFunctions.scala:61)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:306)
at org.apache.spark.rdd.OrderedRDDFunctions.sortByKey(OrderedRDDFunctions.scala:61)
at com.github.ehiggs.spark.terasort.TeraSort$.main(TeraSort.scala:79)
at com.github.ehiggs.spark.terasort.TeraSort.main(TeraSort.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:672)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Created 09-07-2016 04:02 AM
I must add that the job runs successfully to completion when I run in local mode, i.e. with:
spark.master=local[*]
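I suppose that makes sense: with local[*] everything runs in a single JVM on the master, which sees /etc/spark/conf/classpath.txt, whereas on YARN the executors are separate JVMs on other nodes. If so, perhaps the jar also needs to appear on the executors' own classpath, something like (a guess on my part; paths are placeholders):

spark-submit \
  --master yarn \
  --jars /path/on/master/custom-codec.jar \
  --conf spark.executor.extraClassPath=custom-codec.jar \
  ...

(My understanding is that --jars ships the file into each container's working directory, so the bare file name should resolve there.)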
Created 09-07-2016 04:05 AM
That one's actually easy: as it says, your codec doesn't have a constructor accepting a SparkConf.
Created 09-07-2016 04:08 AM
Actually, it does indeed have such a constructor. What's strange is that it isn't visible at this point.
As I mentioned earlier, the codec is successfully instantiated the first couple of times (with that same constructor, as there's only one in that class).
Thanks again for your quick responses.
Created 09-10-2016 09:33 AM
I was able to fix this by changing the name of the Java class to something else: once I renamed the class and recreated the jar file, things worked smoothly.
If I recreate the jar file with the originally named class, the failure appears again.
I'm not yet sure how to explain this. Perhaps an older version of my jar file is conflicting with that class name, although I couldn't find one in any of the folders I looked in.
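One way I could test the stale-copy theory is to log where the JVM actually loaded the class from, e.g. from inside the codec's constructor (a diagnostic sketch):

// Diagnostic sketch: print which jar this class was actually loaded from,
// to spot a stale duplicate copy lurking on some node.
val source = classOf[new_codec].getProtectionDomain.getCodeSource
val location = if (source == null) "unknown (bootstrap?)" else source.getLocation.toString
println("new_codec loaded from: " + location)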
Although I have a working solution, I'm still curious about the reason for the failure. Your tips welcome.
Thanks.