<?xml version="1.0" encoding="UTF-8"?>
<rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:taxo="http://purl.org/rss/1.0/modules/taxonomy/" version="2.0">
  <channel>
    <title>question Re: job failing in livy - no logs in Archives of Support Questions (Read Only)</title>
    <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/job-failing-in-livy-no-logs/m-p/54797#M60990</link>
    <description>&lt;P&gt;Wondering if anyone has any thoughts on this? I am stumped. Someone suggested it might be the driver running out of memory, but I boosted driver memory to 4G without any change. I am also still unable to find any logs that indicate the issue. I assume it must be the driver that is generating the error, because YARN and Spark consider the process incomplete until it times out.&lt;/P&gt;</description>
    <pubDate>Tue, 16 May 2017 21:08:35 GMT</pubDate>
    <dc:creator>wkupersa</dc:creator>
    <dc:date>2017-05-16T21:08:35Z</dc:date>
    <item>
      <title>job failing in livy - no logs</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/job-failing-in-livy-no-logs/m-p/54661#M60988</link>
      <description>&lt;P&gt;I just brought up a new CDH 5 cluster and compiled and installed Livy. I can run jobs using spark-submit, and they run via YARN and Spark normally.&lt;/P&gt;&lt;P&gt;Jobs submitted through Livy create the SparkContext (according to Jupyter), and I can assign things and run transformations, but the jobs die as soon as I try to execute an action. I get an error in Jupyter that the SparkContext has been shut down. The job itself is registered in the Spark History Server as having an executor (the driver) added, and nothing else.&lt;/P&gt;&lt;P&gt;There is no mention of the job in the list of YARN applications, and I don't see anything telling in livy.log or the Spark History Server log. Without any entry in the YARN applications list, I am not sure where to look to see why it is dying.&lt;/P&gt;&lt;P&gt;All of this runs fine:&lt;/P&gt;&lt;PRE&gt;from pyspark.sql import Row
from pyspark.sql import SQLContext
from pyspark.sql.window import Window
import pyspark.sql.functions as func
sqlc = SQLContext(sc)
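# Nine toy rows: building the DataFrame below succeeds; only the count() action fails.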

row1 = Row(name='willie', number=1)
row2 = Row(name='bob', number=1)
row3 = Row(name='bob', number=3)
row4 = Row(name='willie', number=6)
row5 = Row(name='willie', number=9)
row6 = Row(name='bob', number=12)
row7 = Row(name='willie', number=15)
row8 = Row(name='jon', number=16)
row9 = Row(name='jon', number=17)
df = sqlc.createDataFrame([row1, row2, row3, row4, row5, row6, row7, row8, row9])&lt;/PRE&gt;&lt;P&gt;This next line then dies with the error below.&lt;/P&gt;&lt;PRE&gt;df.count()&lt;/PRE&gt;&lt;P&gt;Any pointers on how to troubleshoot would be appreciated!&lt;/P&gt;&lt;PRE&gt;Traceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 269, in count
    return int(self._jdf.count())
  File "/opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 45, in deco
    return f(*a, **kw)
  File "/opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
    format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling o68.count.
: java.lang.IllegalStateException: SparkContext has been shutdown
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1854)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1875)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1888)
	at org.apache.spark.SparkContext.runJob(SparkContext.scala:1959)
	at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
	at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
	at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
	at org.apache.spark.rdd.RDD.collect(RDD.scala:926)
	at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:166)
	at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174)
	at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1514)
	at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1514)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:53)
	at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2101)
	at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$execute$1(DataFrame.scala:1513)
	at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$collect(DataFrame.scala:1520)
	at org.apache.spark.sql.DataFrame$$anonfun$count$1.apply(DataFrame.scala:1530)
	at org.apache.spark.sql.DataFrame$$anonfun$count$1.apply(DataFrame.scala:1529)
	at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2114)
	at org.apache.spark.sql.DataFrame.count(DataFrame.scala:1529)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:606)
	at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
	at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
	at py4j.Gateway.invoke(Gateway.java:259)
	at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
	at py4j.commands.CallCommand.execute(CallCommand.java:79)
	at py4j.GatewayConnection.run(GatewayConnection.java:209)
	at java.lang.Thread.run(Thread.java:745)&lt;/PRE&gt;</description>
      <pubDate>Fri, 16 Sep 2022 11:35:43 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/job-failing-in-livy-no-logs/m-p/54661#M60988</guid>
      <dc:creator>wkupersa</dc:creator>
      <dc:date>2022-09-16T11:35:43Z</dc:date>
    </item>
    <item>
      <title>Re: job failing in livy - no logs</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/job-failing-in-livy-no-logs/m-p/54745#M60989</link>
      <description>&lt;P&gt;I figured out part of this. I needed to set:&lt;/P&gt;&lt;PRE&gt;livy.spark.master = yarn&lt;/PRE&gt;&lt;P&gt;With that set, the job does appear in YARN. It still dies prematurely when I run it through Livy, and the YARN logs look clean, so I am not sure what is going on there. But at least that is something.&lt;/P&gt;
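&lt;P&gt;A quick way to cross-check the setting is to list Livy's sessions over its REST API and compare against the YARN ResourceManager UI; a minimal sketch in Python with the requests library, assuming Livy on its default port 8998:&lt;/P&gt;&lt;PRE&gt;import requests

# Assumption for illustration: Livy running locally on its default port.
LIVY = "http://localhost:8998"

# GET /sessions lists Livy's interactive sessions. With
# livy.spark.master = yarn in effect, each session here should have
# a matching application in the YARN ResourceManager UI.
for s in requests.get(LIVY + "/sessions").json()["sessions"]:
    print(s["id"], s["kind"], s["state"])&lt;/PRE&gt;</description>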
      <pubDate>Mon, 15 May 2017 20:27:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/job-failing-in-livy-no-logs/m-p/54745#M60989</guid>
      <dc:creator>wkupersa</dc:creator>
      <dc:date>2017-05-15T20:27:53Z</dc:date>
    </item>
    <item>
      <title>Re: job failing in livy - no logs</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/job-failing-in-livy-no-logs/m-p/54797#M60990</link>
      <description>&lt;P&gt;Wondering if anyone has any thoughts on this? I am stumped. Someone suggested it might be the driver running out of memory, but I boosted driver memory to 4G without any change. I am also still unable to find any logs that indicate the issue. I assume it must be the driver that is generating the error, because YARN and Spark consider the process incomplete until it times out.&lt;/P&gt;
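&lt;P&gt;For reference, driver memory can be set per session via Livy's session-creation payload (driverMemory is one of its fields); a minimal sketch in Python with the requests library, assuming the default Livy host and port:&lt;/P&gt;&lt;PRE&gt;import json
import requests

# Assumption for illustration: Livy on localhost:8998.
# driverMemory is part of Livy's POST /sessions request body.
payload = {"kind": "pyspark", "driverMemory": "4g"}
resp = requests.post("http://localhost:8998/sessions",
                     data=json.dumps(payload),
                     headers={"Content-Type": "application/json"})
print(resp.json())&lt;/PRE&gt;</description>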
      <pubDate>Tue, 16 May 2017 21:08:35 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/job-failing-in-livy-no-logs/m-p/54797#M60990</guid>
      <dc:creator>wkupersa</dc:creator>
      <dc:date>2017-05-16T21:08:35Z</dc:date>
    </item>
    <item>
      <title>Re: job failing in livy - no logs</title>
      <link>https://community.cloudera.com/t5/Archives-of-Support-Questions/job-failing-in-livy-no-logs/m-p/54893#M60991</link>
      <description>&lt;P&gt;For the next poor schlub who encounters this weird behavior: I figured out a workaround, which also helped me pinpoint the problem.&lt;/P&gt;&lt;P&gt;It turns out the problem was with the SQLContext. My SparkContext could create and manipulate RDDs all day without a problem; the SQLContext, however, would not let me work with DataFrames without an error.&lt;/P&gt;&lt;P&gt;I found that if I stopped my SparkContext, created a new one, and then created a new SQLContext from that, everything worked fine. This leads me to believe something was wrong with the SparkContext that sparkmagic was passing me.&lt;/P&gt;&lt;P&gt;I've since updated to Spark 2 and have not hit any trouble with SparkSession, so I doubt I will dig into this further.&lt;/P&gt;
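&lt;P&gt;In code, the workaround looks roughly like the sketch below (assuming a Spark 1.x PySpark session where sc is the context sparkmagic handed over; the app name is arbitrary):&lt;/P&gt;&lt;PRE&gt;from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

# Throw away the context sparkmagic passed in...
sc.stop()

# ...build a fresh one, then hang a new SQLContext off it.
conf = SparkConf().setAppName("rebuilt-context")  # assumption: any app name works
sc = SparkContext(conf=conf)
sqlc = SQLContext(sc)

# DataFrame actions now complete instead of dying with
# "SparkContext has been shutdown".
df = sqlc.createDataFrame([("willie", 1), ("bob", 3)], ["name", "number"])
print(df.count())&lt;/PRE&gt;</description>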
      <pubDate>Fri, 19 May 2017 17:54:53 GMT</pubDate>
      <guid>https://community.cloudera.com/t5/Archives-of-Support-Questions/job-failing-in-livy-no-logs/m-p/54893#M60991</guid>
      <dc:creator>wkupersa</dc:creator>
      <dc:date>2017-05-19T17:54:53Z</dc:date>
    </item>
  </channel>
</rss>

