Member since: 01-14-2017
Posts: 17
Kudos Received: 0
Solutions: 3
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 8183 | 05-19-2017 10:54 AM
 | 14137 | 05-17-2017 03:05 PM
 | 1140 | 05-15-2017 09:35 AM
05-19-2017
10:54 AM
For the next poor schlub who encounters this weird behavior, I figured out a workaround that also helped me pinpoint the problem. It turns out the problem was with the SQLContext. I realized that my SparkContext could create and manipulate RDDs all day without a problem; the SQLContext, however, would not let me work with DataFrames without an error. I found that if I stopped my SparkContext, created a new one, and then created a new SQLContext from that, everything worked fine. This leads me to believe that something was going on with the SparkContext I was being passed from SparkMagic. I've since updated to Spark 2 and haven't seen any trouble with the SparkSession, so I doubt I'll dig into this any further.
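For the record, the workaround looks roughly like this (a sketch against the Spark 1.x API; it assumes `sc` is the SparkContext that SparkMagic handed to the notebook, and the app name is illustrative):

```python
from pyspark import SparkConf, SparkContext
from pyspark.sql import SQLContext

# Stop the context SparkMagic passed in, then build a fresh pair.
sc.stop()
sc = SparkContext(conf=SparkConf().setAppName("rebuilt-context"))
sqlc = SQLContext(sc)  # DataFrame operations worked again against this one
```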
05-17-2017
03:05 PM
I was able to add it as a service after activating the parcel AND downloading the CSD jar. I didn't realize that I needed both of these; I thought it was either/or.
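For anyone else hitting this, the two pieces are separate (the path below is the Cloudera Manager default; the jar name is illustrative and depends on the Spark 2 release you download):

```shell
# 1) The CSD jar teaches Cloudera Manager about the SPARK2 service type.
sudo cp SPARK2_ON_YARN-*.jar /opt/cloudera/csd/
sudo service cloudera-scm-server restart

# 2) The parcel supplies the actual Spark 2 binaries: download, distribute,
#    and activate it from the Parcels page in Cloudera Manager.
```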
05-16-2017
02:08 PM
Wondering if anyone has any thoughts on this? I am stumped. Someone suggested it might be the driver running out of memory, so I boosted driver memory to 4G without any change. I'm also still not able to find any logs that indicate the issue. I assume it must be the driver that is generating the error, because YARN and Spark consider the process incomplete until it times out.
05-15-2017
02:32 PM
Thanks Bill. I just built this cluster using CDH 5.11.0. I installed Spark 1.6.0 through the wizard along with YARN, ZooKeeper, and HDFS. I verified that Spark 1.6.0 worked and later added Hive as well. I added the parcel configuration for Spark 2, then downloaded, distributed, and activated it; it appears as distributed and activated under Parcels. I then expected to see it in the list of services I could add under the cluster's "Add Service" option, but I don't. I turned off "Validate Parcel Relations" to see if that would make it appear, but it didn't.
05-15-2017
02:06 PM
Went there, activated it, but I still don't see it as a choice when I Add Service.
05-15-2017
01:27 PM
I figured out part of this: I needed to set livy.spark.master = yarn. With that set, the job does appear in YARN. It still dies prematurely when I run it through Livy, and the YARN logs look happy, so I'm not sure what is going on there. But at least that's something.
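For reference, the setting goes in Livy's configuration file (the file name varies by Livy version, livy.conf or livy-defaults.conf, and the path below assumes a default install):

```properties
# $LIVY_HOME/conf/livy.conf
livy.spark.master = yarn
```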
05-15-2017
09:35 AM
I figured out what was causing this. One of the repo sites that I had configured was not being let through my proxy. Once I opened up the proxy to that repo site, the error went away. --Willie
05-12-2017
01:23 PM
Hello,
I have a cluster that is not able to use its proxy (That's a separate post). In order to install SPARK2, I have attempted to follow the instructions here: https://www.cloudera.com/documentation/spark2/latest/topics/spark2_installing.html
I put the SPARK2 parcel into the /opt/cloudera/csd directory. After restarting the scm-server, I was able to distribute and activate the CSD. However, SPARK2 does not appear as an option under "Add a Service".
The only related error I see is one in the scm-server log indicating that it failed to load the CSD from /opt/cloudera/csd because it has no jar extension. Although I think this is a spurious error, because the package has already been distributed and activated, I did rename it with a jar extension. Upon restarting, I got an error indicating that there was a ZipException when it tried to uncompress it. The file does appear to be a valid gzipped tar archive.
Any ideas? Thanks!
Labels:
- Apache Spark
- Cloudera Manager
05-12-2017
09:17 AM
Hello, I'm running CDH 5.11.0. I have my proxy set in the network settings; however, I get this error whenever I try to check for new parcels:

2017-05-12 12:14:08,077 WARN ParcelUpdateService:com.cloudera.cmf.persist.ReadWriteDatabaseTaskCallable: Error while executing CmfEntityManager task
java.util.MissingFormatArgumentException: Format specifier 's'
at java.util.Formatter.format(Formatter.java:2487)
at java.util.Formatter.format(Formatter.java:2423)
at java.lang.String.format(String.java:2790)
at com.cloudera.parcel.components.ParcelDownloaderImpl.syncRemoteRepos(ParcelDownloaderImpl.java:359)
at com.cloudera.parcel.components.ParcelDownloaderImpl$1.run(ParcelDownloaderImpl.java:439)
at com.cloudera.parcel.components.ParcelDownloaderImpl$1.run(ParcelDownloaderImpl.java:434)
at com.cloudera.cmf.persist.ReadWriteDatabaseTaskCallable.call(ReadWriteDatabaseTaskCallable.java:36)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
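The exception here is Java's String.format being given fewer arguments than the format string has '%s' specifiers, which suggests Cloudera Manager's repo-sync code choked while building an error message about a repo it couldn't reach (consistent with the proxy-blocked repo that turned out to be the cause, per the 05-15 reply above). A minimal Python analogue of the same failure class:

```python
# Java's String.format raises MissingFormatArgumentException when a '%s'
# specifier has no matching argument; Python's % operator fails the same way.
template = "Error syncing remote parcel repo %s: %s"

try:
    template % ("https://example.com/parcels",)  # one argument short
except TypeError as exc:
    print("format failed:", exc)
```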
Labels:
- Cloudera Manager
05-11-2017
02:43 PM
I just brought up a new CDH 5 cluster and compiled and installed Livy. I can run jobs using spark-submit and they run via YARN and Spark normally. Jobs submitted through Livy create the SparkContext (according to Jupyter); I can assign things and run transformations, but the jobs die as soon as I try to execute an action. I get an error in Jupyter that the SparkContext has been shut down. The job itself is registered in the Spark History Server as having an executor driver added, nothing else. There is no mention of the job in the list of YARN applications. I don't see anything telling in livy.log or the spark-history-server log. Without any entry in YARN applications, I am not sure where to look to see why it is dying.

This all runs fine:

from pyspark.sql import Row
from pyspark.sql import SQLContext
from pyspark.sql.window import Window
import pyspark.sql.functions as func

sqlc = SQLContext(sc)

row1 = Row(name='willie', number=1)
row2 = Row(name='bob', number=1)
row3 = Row(name='bob', number=3)
row4 = Row(name='willie', number=6)
row5 = Row(name='willie', number=9)
row6 = Row(name='bob', number=12)
row7 = Row(name='willie', number=15)
row8 = Row(name='jon', number=16)
row9 = Row(name='jon', number=17)

df = sqlc.createDataFrame([row1, row2, row3, row4, row5, row6, row7, row8, row9])

This then dies with the following error:

df.count()

Any pointers on how to troubleshoot would be appreciated!

An error occurred while calling o68.count.
: java.lang.IllegalStateException: SparkContext has been shutdown
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1854)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1875)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1888)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1959)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:927)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:150)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:111)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:316)
at org.apache.spark.rdd.RDD.collect(RDD.scala:926)
at org.apache.spark.sql.execution.SparkPlan.executeCollect(SparkPlan.scala:166)
at org.apache.spark.sql.execution.SparkPlan.executeCollectPublic(SparkPlan.scala:174)
at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1514)
at org.apache.spark.sql.DataFrame$$anonfun$org$apache$spark$sql$DataFrame$$execute$1$1.apply(DataFrame.scala:1514)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:53)
at org.apache.spark.sql.DataFrame.withNewExecutionId(DataFrame.scala:2101)
at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$execute$1(DataFrame.scala:1513)
at org.apache.spark.sql.DataFrame.org$apache$spark$sql$DataFrame$$collect(DataFrame.scala:1520)
at org.apache.spark.sql.DataFrame$$anonfun$count$1.apply(DataFrame.scala:1530)
at org.apache.spark.sql.DataFrame$$anonfun$count$1.apply(DataFrame.scala:1529)
at org.apache.spark.sql.DataFrame.withCallback(DataFrame.scala:2114)
at org.apache.spark.sql.DataFrame.count(DataFrame.scala:1529)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:231)
at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:381)
at py4j.Gateway.invoke(Gateway.java:259)
at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
at py4j.commands.CallCommand.execute(CallCommand.java:79)
at py4j.GatewayConnection.run(GatewayConnection.java:209)
at java.lang.Thread.run(Thread.java:745)
Traceback (most recent call last):
File "/opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/spark/python/lib/pyspark.zip/pyspark/sql/dataframe.py", line 269, in count
return int(self._jdf.count())
File "/opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/spark/python/lib/py4j-0.9-src.zip/py4j/java_gateway.py", line 813, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 45, in deco
return f(*a, **kw)
File "/opt/cloudera/parcels/CDH-5.11.0-1.cdh5.11.0.p0.34/lib/spark/python/lib/py4j-0.9-src.zip/py4j/protocol.py", line 308, in get_return_value
format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling o68.count.
: java.lang.IllegalStateException: SparkContext has been shutdown
Labels:
- Apache Spark
04-03-2017
02:23 PM
I tried this and while my job is still running, it looks like it has gotten farther than it has in the past. Thanks!
03-31-2017
09:45 PM
I am getting garbage collection errors: "java.lang.OutOfMemoryError: GC overhead limit exceeded". Everything that I have read points to heap size. I have upped all the heap-related parameters that I see in my YARN configuration options. When I try to run spark-submit with the argument --driver-java-options "-Xmx2048m", I get the error "Initial heap size set to a larger value than the maximum heap size". I am not sure why it thinks the maximum heap size is smaller than 2G. I am not sure what else to look at. Thanks!
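A possible explanation (hedged; exact behavior depends on the Spark version) is that the launcher already derives the driver's heap settings from --driver-memory, so a bare -Xmx passed through --driver-java-options can land below the initial heap the launcher sets, producing exactly that message. The usual way to size the heaps is through Spark's own flags, sketched here with illustrative values and script name:

```shell
# Size driver and executor heaps via Spark's flags rather than raw -Xmx:
spark-submit \
  --master yarn \
  --driver-memory 4g \
  --executor-memory 4g \
  my_job.py
```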
Labels:
- Apache Spark
- Apache YARN
03-08-2017
02:55 PM
A bit more info... (and this is cross-posted to the Project Jupyter list). I think that messaging is getting screwed up between PySpark and Livy. When the last cell is executed, I see this on the client side:

2017-03-08 22:24:48,505 INFO EventsHandler InstanceId: 0e1c8fd2-047e-4337-b264-5b64ba74de5a, EventName: notebookStatementExecutionStart, Timestamp: 2017-03-08 22:24:48.504920, SessionGuid: 03d14478-6adc-4b34-abef-b9b6fd400543, LivyKind: pyspark, SessionId: 8, StatementGuid: f1933b11-b767-4a18-b311-c48901ad8369
2017-03-08 22:24:48,788 DEBUG Command Status of statement 8 is running.
2017-03-08 22:24:50,920 DEBUG Command Status of statement 8 is running.

...and it never comes back. On the Livy end, I see:

17/03/08 17:26:26 INFO ContextLauncher: 17/03/08 17:26:26 INFO scheduler.DAGScheduler: ResultStage 17 (collect at <stdin>:5) finished in 1.521 s
17/03/08 17:26:26 INFO ContextLauncher: 17/03/08 17:26:26 INFO scheduler.DAGScheduler: Job 8 finished: collect at <stdin>:5, took 3.729078 s
17/03/08 17:26:27 DEBUG RpcDispatcher: [ClientProtocol] Registered outstanding rpc 230 (com.cloudera.livy.rsc.BaseProtocol$GetReplJobResult).
17/03/08 17:26:27 DEBUG KryoMessageCodec: Encoded message of type com.cloudera.livy.rsc.rpc.Rpc$MessageHeader (6 bytes)
17/03/08 17:26:27 DEBUG KryoMessageCodec: Encoded message of type com.cloudera.livy.rsc.BaseProtocol$GetReplJobResult (91 bytes)
17/03/08 17:26:27 DEBUG KryoMessageCodec: Decoded message of type com.cloudera.livy.rsc.rpc.Rpc$MessageHeader (6 bytes)
17/03/08 17:26:27 DEBUG KryoMessageCodec: Decoded message of type com.cloudera.livy.rsc.rpc.Rpc$NullMessage (2 bytes)
17/03/08 17:26:27 DEBUG RpcDispatcher: [ClientProtocol] Received RPC message: type=REPLY id=230 payload=com.cloudera.livy.rsc.rpc.Rpc$NullMessage
17/03/08 17:26:28 DEBUG RpcDispatcher: [ClientProtocol] Registered outstanding rpc 231 (com.cloudera.livy.rsc.BaseProtocol$GetReplJobResult).

ad infinitum. So, with my limited knowledge, it looks to me like Livy thinks it has sent a result for a finished job, but PySpark hasn't received it. Anyone seen this before? Any thoughts?
03-07-2017
12:59 PM
Hello, I am running IPython -> Livy to send jobs to my CDH 5.9.0 cluster running Spark. My job runs through a few operations reading files from HDFS into DataFrames and then doing some operations on those DataFrames. The code then reaches a cell with a join and stops progressing. If I leave it alone for long enough, the session is eventually killed. I am not sure how to debug this. YARN shows the job as still running. Spark shows all jobs completed and no active or pending jobs. All the Spark jobs say that they succeeded, though some stages were skipped. If I go to the details for the last stage, all statuses say "Success." The logs for the executors all say "Finished task ###. #### bytes sent to driver." The thread dump for the driver shows a lot of waiting threads. If I run the job via pyspark, not through IPython/Livy, it works fine. But there are no errors in the livy log either. I'm not sure how to figure this out. Any thoughts? Thanks!
Labels:
- Apache Spark
- Apache YARN