Created on 04-19-2018 11:16 AM - edited 09-16-2022 06:07 AM
I tried to upgrade Spark from 2.2 to 2.3 and got an error. It appears to be caused by a missing lineage directory, which prevents the SparkContext from initializing. I rolled back to CDS 2.2 release 2. Does anyone have a way to fix this?
Thanks.
Created 04-27-2018 09:17 AM
I get the same error as you. Did you solve this?
Created 05-01-2018 07:22 PM
Thanks for reporting. Could you share the full error about the missing lineage file, please? I quickly tested an upgrade from 2.2 to 2.3 but didn't hit this. A full stack trace would certainly help.
Created 05-01-2018 07:48 PM
Here is the full stack when I try to launch spark-shell.
18/05/02 02:47:37 ERROR spark.SparkContext: Error initializing SparkContext.
org.apache.spark.SparkException: Exception when registering SparkListener
  at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2364)
  at org.apache.spark.SparkContext.<init>(SparkContext.scala:553)
  at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2486)
  at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:930)
  at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:921)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)
  at org.apache.spark.repl.Main$.createSparkSession(Main.scala:103)
  at $line3.$read$$iw$$iw.<init>(<console>:15)
  at $line3.$read$$iw.<init>(<console>:43)
  at $line3.$read.<init>(<console>:45)
  at $line3.$read$.<init>(<console>:49)
  at $line3.$read$.<clinit>(<console>)
  at $line3.$eval$.$print$lzycompute(<console>:7)
  at $line3.$eval$.$print(<console>:6)
  at $line3.$eval.$print(<console>)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:497)
  at scala.tools.nsc.interpreter.IMain$ReadEvalPrint.call(IMain.scala:786)
  at scala.tools.nsc.interpreter.IMain$Request.loadAndRun(IMain.scala:1047)
  at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:638)
  at scala.tools.nsc.interpreter.IMain$WrappedRequest$$anonfun$loadAndRunReq$1.apply(IMain.scala:637)
  at scala.reflect.internal.util.ScalaClassLoader$class.asContext(ScalaClassLoader.scala:31)
  at scala.reflect.internal.util.AbstractFileClassLoader.asContext(AbstractFileClassLoader.scala:19)
  at scala.tools.nsc.interpreter.IMain$WrappedRequest.loadAndRunReq(IMain.scala:637)
  at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:569)
  at scala.tools.nsc.interpreter.IMain.interpret(IMain.scala:565)
  at scala.tools.nsc.interpreter.ILoop.interpretStartingWith(ILoop.scala:807)
  at scala.tools.nsc.interpreter.ILoop.command(ILoop.scala:681)
  at scala.tools.nsc.interpreter.ILoop.processLine(ILoop.scala:395)
  at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$2.apply(SparkILoop.scala:79)
  at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1$$anonfun$apply$mcV$sp$1$$anonfun$apply$mcV$sp$2.apply(SparkILoop.scala:79)
  at scala.collection.immutable.List.foreach(List.scala:381)
  at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(SparkILoop.scala:79)
  at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1$$anonfun$apply$mcV$sp$1.apply(SparkILoop.scala:79)
  at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1$$anonfun$apply$mcV$sp$1.apply(SparkILoop.scala:79)
  at scala.tools.nsc.interpreter.ILoop.savingReplayStack(ILoop.scala:91)
  at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply$mcV$sp(SparkILoop.scala:78)
  at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:78)
  at org.apache.spark.repl.SparkILoop$$anonfun$initializeSpark$1.apply(SparkILoop.scala:78)
  at scala.tools.nsc.interpreter.IMain.beQuietDuring(IMain.scala:214)
  at org.apache.spark.repl.SparkILoop.initializeSpark(SparkILoop.scala:77)
  at org.apache.spark.repl.SparkILoop.loadFiles(SparkILoop.scala:110)
  at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply$mcZ$sp(ILoop.scala:920)
  at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
  at scala.tools.nsc.interpreter.ILoop$$anonfun$process$1.apply(ILoop.scala:909)
  at scala.reflect.internal.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:97)
  at scala.tools.nsc.interpreter.ILoop.process(ILoop.scala:909)
  at org.apache.spark.repl.Main$.doMain(Main.scala:76)
  at org.apache.spark.repl.Main$.main(Main.scala:56)
  at org.apache.spark.repl.Main.main(Main.scala)
  at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
  at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
  at java.lang.reflect.Method.invoke(Method.java:497)
  at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
  at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:892)
  at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:197)
  at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:227)
  at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:136)
  at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.io.FileNotFoundException: Lineage directory /var/log/spark2/lineage doesn't exist or is not writable.
  at com.cloudera.spark.lineage.LineageWriter$.checkLineageConfig(LineageWriter.scala:158)
  at com.cloudera.spark.lineage.NavigatorAppListener.<init>(ClouderaNavigatorListener.scala:30)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
  at org.apache.spark.util.Utils$$anonfun$loadExtensions$1.apply(Utils.scala:2740)
  at org.apache.spark.util.Utils$$anonfun$loadExtensions$1.apply(Utils.scala:2732)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
  at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
  at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
  at org.apache.spark.util.Utils$.loadExtensions(Utils.scala:2732)
  at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2353)
  at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2352)
  at scala.Option.foreach(Option.scala:257)
  at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2352)
  ... 62 more
org.apache.spark.SparkException: Exception when registering SparkListener
  at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2364)
  at org.apache.spark.SparkContext.<init>(SparkContext.scala:553)
  at org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2486)
  at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:930)
  at org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:921)
  at scala.Option.getOrElse(Option.scala:121)
  at org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921)
  at org.apache.spark.repl.Main$.createSparkSession(Main.scala:103)
  ... 55 elided
Caused by: java.io.FileNotFoundException: Lineage directory /var/log/spark2/lineage doesn't exist or is not writable.
  at com.cloudera.spark.lineage.LineageWriter$.checkLineageConfig(LineageWriter.scala:158)
  at com.cloudera.spark.lineage.NavigatorAppListener.<init>(ClouderaNavigatorListener.scala:30)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
  at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
  at org.apache.spark.util.Utils$$anonfun$loadExtensions$1.apply(Utils.scala:2740)
  at org.apache.spark.util.Utils$$anonfun$loadExtensions$1.apply(Utils.scala:2732)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
  at scala.collection.mutable.ArraySeq.foreach(ArraySeq.scala:74)
  at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
  at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
  at org.apache.spark.util.Utils$.loadExtensions(Utils.scala:2732)
  at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2353)
  at org.apache.spark.SparkContext$$anonfun$setupAndStartListenerBus$1.apply(SparkContext.scala:2352)
  at scala.Option.foreach(Option.scala:257)
  at org.apache.spark.SparkContext.setupAndStartListenerBus(SparkContext.scala:2352)
  ... 62 more
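The key part is the Caused by line: /var/log/spark2/lineage doesn't exist or is not writable on the host I launch from. A quick way to check the directory's state on the affected host (just a sanity-check sketch, using the path from the error above):

ls -ld /var/log/spark2 /var/log/spark2/lineage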
Hope this helps.
Cheers,
Ben
Created 05-01-2018 11:54 PM
Thanks @Benassi10 for providing the context. Much appreciated.
We are discussing this internally to see what can cause such issues. One theory is that we enabled support for Spark lineage in CDS 2.3, and if the cm-agent doesn't create the /var/log/spark2/lineage directory (for some reason) you can see this behaviour. If lineage is not important to you, can you try running the shell with lineage disabled?
spark2-shell --conf spark.lineage.enabled=false
If you don't want to disable lineage, another workaround is to change the lineage directory to /tmp in CM > Spark2 > Configuration > GATEWAY Lineage Log Directory > /tmp, followed by redeploying the client configuration.
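After redeploying, you can confirm the gateway host actually picked up the new value before retrying, e.g. (assuming the default CDS client configuration path of /etc/spark2/conf; your installation may differ):

grep -i lineage /etc/spark2/conf/spark-defaults.conf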
Let us know if the above helps. I will update the thread once I have more information on the fix.
Created 05-02-2018 07:20 AM
After I changed the directory to /tmp, I verified that Spark 2.3 works normally.
Is there any possibility of a new CDS 2.3 release that fixes this?
Created 05-02-2018 07:32 AM
Thanks, Lucas. That's great to hear!
Can you please check whether switching it back to /var/log/spark2/lineage and redeploying the client configuration works too?
As promised, once the fix is identified I will update this thread.
Created on 05-02-2018 08:47 AM - edited 05-02-2018 09:03 AM
I got it to work too by changing the directory to /tmp, but when I changed it back to /var/log/spark2/lineage, the error came back. So I created the directory manually: I set the spark2 and lineage directories to be owned by spark:spark and made the lineage directory writable by all (rwxrwxrwx) with the sticky bit (t) set. After doing this, the error goes away.
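For reference, this is roughly what I ran on the gateway host (a sketch reconstructed from memory; adjust the ownership if your Spark role runs as a different user):

sudo mkdir -p /var/log/spark2/lineage
sudo chown spark:spark /var/log/spark2 /var/log/spark2/lineage
sudo chmod 1777 /var/log/spark2/lineage    # world-writable with the sticky bit; ls shows rwxrwxrwt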
Created 05-02-2018 08:18 PM
Cool. I will feed this back into the internal Jira where we are tracking this issue.
Thx for sharing.