Created on 07-11-2016 08:15 AM - edited 09-16-2022 03:29 AM
Hello, I'm trying to run a spark-submit job, but I get this error:
WARN scheduler.TaskSetManager: Lost task 0.0 in stage 2.0 (TID 70, totoro.akainix.local): java.lang.AbstractMethodError
at org.apache.spark.Logging$class.log(Logging.scala:51)
at org.apache.spark.streaming.twitter.TwitterReceiver.log(TwitterInputDStream.scala:60)
at org.apache.spark.Logging$class.logInfo(Logging.scala:58)
at org.apache.spark.streaming.twitter.TwitterReceiver.logInfo(TwitterInputDStream.scala:60)
at org.apache.spark.streaming.twitter.TwitterReceiver.onStart(TwitterInputDStream.scala:93)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:148)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:130)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:575)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:565)
at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
ERROR cluster.YarnScheduler: Lost executor 18 on totoro.akainix.local: Container marked as failed: container_1468247436212_0003_01_000019 on host: totoro.akainix.local. Exit status: 50. Diagnostics: Exception from container-launch.
Container id: container_1468247436212_0003_01_000019
Exit code: 50
Stack trace: ExitCodeException exitCode=50:
at org.apache.hadoop.util.Shell.runCommand(Shell.java:561)
at org.apache.hadoop.util.Shell.run(Shell.java:478)
at org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:738)
at org.apache.hadoop.yarn.server.nodemanager.DefaultContainerExecutor.launchContainer(DefaultContainerExecutor.java:213)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:302)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.call(ContainerLaunch.java:82)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Container exited with a non-zero exit code 50
I'm using Spark 1.6.0 on YARN, and this is the tutorial that I'm following.
Please help me, I'm completely lost.
EDIT: here is more info about the error:
16/07/11 12:32:23 ERROR executor.Executor: Exception in task 0.0 in stage 3.0 (TID 72)
java.lang.AbstractMethodError
at org.apache.spark.Logging$class.log(Logging.scala:51)
at org.apache.spark.streaming.twitter.TwitterReceiver.log(TwitterInputDStream.scala:60)
at org.apache.spark.Logging$class.logInfo(Logging.scala:58)
at org.apache.spark.streaming.twitter.TwitterReceiver.logInfo(TwitterInputDStream.scala:60)
at org.apache.spark.streaming.twitter.TwitterReceiver.onStart(TwitterInputDStream.scala:93)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:148)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:130)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:575)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:565)
at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
16/07/11 12:32:23 ERROR util.SparkUncaughtExceptionHandler: Uncaught exception in thread Thread[Executor task launch worker-0,5,main]
java.lang.AbstractMethodError
at org.apache.spark.Logging$class.log(Logging.scala:51)
at org.apache.spark.streaming.twitter.TwitterReceiver.log(TwitterInputDStream.scala:60)
at org.apache.spark.Logging$class.logInfo(Logging.scala:58)
at org.apache.spark.streaming.twitter.TwitterReceiver.logInfo(TwitterInputDStream.scala:60)
at org.apache.spark.streaming.twitter.TwitterReceiver.onStart(TwitterInputDStream.scala:93)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:148)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:130)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:575)
at org.apache.spark.streaming.scheduler.ReceiverTracker$ReceiverTrackerEndpoint$$anonfun$9.apply(ReceiverTracker.scala:565)
at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
at org.apache.spark.SparkContext$$anonfun$38.apply(SparkContext.scala:2003)
at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
at org.apache.spark.scheduler.Task.run(Task.scala:89)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
Created 11-06-2016 09:31 PM
I am also having the same problem on the CDH 5.8 QuickStart VM.
Even if I pass the jars with --jars on spark-submit, I still get the error.
These are the jars I am using:
spark-streaming-twitter_2.10-1.6.0-cdh5.8.0.jar
twitter4j-core-4.0.4.jar
twitter4j-stream-4.0.4.jar
Error:
ver.ReceiverSupervisorImpl: Starting receiver
16/11/06 19:07:15 ERROR executor.Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.NoSuchMethodError: twitter4j.TwitterStream.addListener(Ltwitter4j/StreamListener;)V
at org.apache.spark.streaming.twitter.TwitterReceiver.onStart(TwitterInputDStream.scala:72)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.scala:148)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.start(ReceiverSupervisor.scala:130)
at org.apache.spark.st
Created 11-07-2016 01:32 AM
It's not the same problem. Here you have put an incompatible version of twitter4j on the classpath.
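A quick way to check which twitter4j jar is actually being loaded at runtime is to print the code source of one of its classes from inside the job. A small diagnostic sketch (illustrative only, not CDH-specific):

// Diagnostic sketch: prints where the twitter4j classes were loaded from,
// so you can see whether an old 3.x jar on the cluster classpath is shadowing
// your 4.0.4 jar. Run it inside the driver or inside a task/receiver.
val twitter4jSource = classOf[twitter4j.TwitterStream]
  .getProtectionDomain
  .getCodeSource
  .getLocation
println(s"twitter4j loaded from: $twitter4jSource")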
Created 11-07-2016 07:36 AM
Thanks for your reply.
I have put these entries in build.sbt:
libraryDependencies += "org.twitter4j" % "twitter4j-stream" % "4.0.4"
libraryDependencies += "org.twitter4j" % "twitter4j-core" % "4.0.4"
There is no issue when I run it in IntelliJ, but when I package it and run it on the CDH 5.8 VM I get the error.
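For reference, a minimal build.sbt along these lines is what I have in mind (a sketch; everything apart from the two twitter4j entries above is an assumption for a Spark 1.6.0 / Scala 2.10 setup):

// Minimal build.sbt sketch. spark-core and spark-streaming are marked "provided"
// because the cluster supplies them; spark-streaming-twitter and twitter4j are not
// on the executor classpath by default, so they must be packaged with the
// application (e.g. in an assembly jar).
name := "twitter-streaming"

scalaVersion := "2.10.6"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core"              % "1.6.0" % "provided",
  "org.apache.spark" %% "spark-streaming"         % "1.6.0" % "provided",
  "org.apache.spark" %% "spark-streaming-twitter" % "1.6.0",
  "org.twitter4j"    %  "twitter4j-core"          % "4.0.4",
  "org.twitter4j"    %  "twitter4j-stream"        % "4.0.4"
)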
Created 11-07-2016 07:42 AM
I suspect you'll find your version of Spark's example uses twitter4j 3.x. Just don't bundle it yourself; it ought to be in the examples .jar.
Created 11-07-2016 07:58 AM
I have to use twitter4j 4.0.4.
I have now tried it like this, but I still got the error:
conf.setJars(Array("/home/cloudera/Documents/lib/twitter/twitter4j-core-4.0.4.jar","/home/cloudera/Documents/lib/twitter/twitter4j-stream-4.0.4.jar"))
For some reason twitter4j 3.0.3 is creating a conflict that I am not able to resolve. I do not need 3.0.3.
Created 11-07-2016 11:11 AM
I used these options and the original exception is resolved, but now I am getting the error below. Is there any way to resolve it?
spark.executor.userClassPathFirst true
spark.driver.userClassPathFirst true
g parents
16/11/07 10:59:45 INFO storage.MemoryStore: Block broadcast_0 stored as values in memory (estimated size 2.8 KB, free 2.8 KB)
Exception in thread "dag-scheduler-event-loop" java.lang.UnsatisfiedLinkError: org.xerial.snappy.SnappyNative.maxCompressedLength(I)I
at org.xerial.snappy.SnappyNative.maxCompressedLength(Native Method)
at org.xerial.snappy.Snappy.maxCompressedLength(Snappy.java:316)
at org.xeria
Created 11-09-2016 12:47 PM
I was able to resolve this two days back and found that CDH is using old twitter4j jars. The jars below are present at /usr/lib/flume-ng/lib (QuickStart VM), and they were causing runtime conflicts:
twitter4j-core-3.0.3.jar
twitter4j-stream-3.0.3.jar
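If removing or renaming those jars on the VM is not an option, another way to sidestep this kind of conflict (assuming the application is built with sbt-assembly; I have not verified this on CDH 5.8) is to shade twitter4j into the application jar so the old 3.0.3 classes on the cluster classpath are never loaded:

// In project/plugins.sbt (plugin version is an assumption):
// addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.3")

// In build.sbt: relocate every twitter4j class bundled in the assembly, including
// the references inside spark-streaming-twitter, so the renamed copies are used
// instead of whatever twitter4j version already sits on the executor classpath.
assemblyShadeRules in assembly := Seq(
  ShadeRule.rename("twitter4j.**" -> "shaded.twitter4j.@1").inAll
)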
Created 11-09-2016 08:02 PM
Hi, were you able to resolve this problem? I am seeing this error:
ip-10-0-0-5.ec2.internal, executor 1): java.lang.AbstractMethodError
at org.apache.spark.Logging$class.log(Logging.scala:50)
at org.apache.spark.streaming.twitter.TwitterReceiver.log(TwitterInputDStream.scala:60)
at org.apache.spark.Logging$class.logInfo(Logging.scala:58)
at org.apache.spark.streaming.twitter.TwitterReceiver.logInfo(TwitterInputDStream.scala:60)
at org.apache.spark.streaming.twitter.TwitterReceiver.onStart(TwitterInputDStream.scala:96)
at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.s