08-07-2014 03:32 AM
I have CDH 5.1 installed on a 5-node cluster. I am building up a Spark program from a series of REPL commands but am experiencing unexpected behaviour.
The commands are as follows:
case class Reading(accountId: String, senderId: String, sensorId: String, metricName: String, readTime: Long, value: Double)
var r2 = Reading("sss", "fff", "FGGF", "hjjj", 232L, 22.3)
import scala.collection.mutable.ListBuffer
var readings = ListBuffer[Reading]()
readings.append(r2)
The last line throws a type mismatch error:
<console>:18: error: type mismatch;
found : Reading
required: Reading
readings.append(r2)
This works as expected from a standalone instance of Spark and Scala on an Ubuntu box. CDH is installed on CentOS, where the Java versions differ: CentOS uses the Oracle JDK, whilst Ubuntu uses OpenJDK. I have tried the code on several instances of CDH that we have here locally and the issue is the same.
I would expect the spark-shell REPL to have full Scala functionality; would this be a valid assumption?
If someone else could try the same commands on a CDH instance, I would be grateful to know whether it works as expected.
vr
Hugh McBride
08-07-2014 03:33 AM
I believe this is an instance of the bug SPARK-1199 (https://issues.apache.org/jira/browse/SPARK-1199) which will be fixed in Spark 1.1.0.
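Until then, one workaround that may help (a sketch, not verified on CDH 5.1) is to enter the whole sequence in a single :paste block, so the case class and the code that uses it are compiled as one unit rather than line by line:

scala> :paste
// Entering paste mode (ctrl-D to finish)

import scala.collection.mutable.ListBuffer

// Same definitions as above, but compiled together in one block
case class Reading(accountId: String, senderId: String, sensorId: String,
                   metricName: String, readTime: Long, value: Double)

val r2 = Reading("sss", "fff", "FGGF", "hjjj", 232L, 22.3)
val readings = ListBuffer[Reading]()
readings.append(r2)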
08-07-2014 03:39 AM
Thanks for the prompt reply. So if I use a regular Scala class in a standalone Spark job, I should be pretty safe, I am guessing.
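Something like this plain (non-case) class is what I have in mind, just as a sketch:

import scala.collection.mutable.ListBuffer

// A plain class version of Reading; loses the generated
// equals/hashCode/toString and pattern matching of a case class
class Reading(val accountId: String, val senderId: String, val sensorId: String,
              val metricName: String, val readTime: Long, val value: Double)

val r2 = new Reading("sss", "fff", "FGGF", "hjjj", 232L, 22.3)
val readings = ListBuffer[Reading]()
readings.append(r2)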
vr
Hugh McBride
08-07-2014 03:41 AM
I think it's specific to case classes, and specific to the shell. The shell is just a little fork of the Scala REPL, so in general it should work just the same, but I think the Spark modifications, especially regarding closure cleaning, introduced this bug.
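In a compiled standalone app the case class and its usage live in one compilation unit, so the REPL line-wrapping never applies. Something along these lines (a sketch, not tested against CDH) should behave normally:

import scala.collection.mutable.ListBuffer

object ReadingsApp {
  // Defined at the top level of a compiled app, so there is no
  // REPL line-wrapping to confuse the types
  case class Reading(accountId: String, senderId: String, sensorId: String,
                     metricName: String, readTime: Long, value: Double)

  def main(args: Array[String]): Unit = {
    val readings = ListBuffer[Reading]()
    readings.append(Reading("sss", "fff", "FGGF", "hjjj", 232L, 22.3))
    println(readings)
  }
}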