Member since 09-25-2015
9 Posts
11 Kudos Received
1 Solution
My Accepted Solutions
Title | Views | Posted |
---|---|---|
 | 4912 | 12-11-2015 09:55 PM |
12-11-2015 09:55 PM
4 Kudos
R is an RDD, so r1 is also an RDD. That means you are calling "parallelize()" on an RDD, which you should not do. Use parallelize() on a local Python object, such as a list.
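Here is a minimal sketch of the difference. It assumes `sc` is the SparkContext that the pyspark shell already creates for you. The function name `build_rdds` and the sample list are made up for illustration, and `r` / `r1` stand in for the RDDs from the question.

```python
def build_rdds(sc):
    """Build an RDD from a local list, then derive a second RDD from it.

    `sc` is assumed to be an existing SparkContext (the pyspark shell
    provides one). `build_rdds` and the sample data are illustrative only.
    """
    local_data = [1, 2, 3, 4]         # plain Python list living on the driver
    r = sc.parallelize(local_data)    # local object -> RDD: the intended use
    r1 = r.map(lambda x: x * 2)       # a transformation on an RDD returns an RDD
    # Do not call sc.parallelize(r1) here: r1 is already an RDD.
    return r, r1
```

In the shell you would call `r, r1 = build_rdds(sc)` and keep working with `r1` directly.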
12-10-2015 02:23 PM
1 Kudo
A few clarifying questions about rawTrainData:
- How is this RDD generated?
- Is it cached?
- How many partitions does it have?

Also, what is the variable "valores"?
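If it helps, the cached and partition questions can be answered from the RDD itself. This is only a rough sketch: the helper name `describe_rdd` is made up, and it assumes the argument is a regular PySpark RDD such as rawTrainData.

```python
def describe_rdd(rdd):
    """Collect basic debugging facts about a PySpark RDD.

    `describe_rdd` is an illustrative helper, not part of Spark.
    """
    return {
        "is_cached": rdd.is_cached,                 # set once cache()/persist() was called
        "num_partitions": rdd.getNumPartitions(),   # how many partitions the RDD has
        "lineage": rdd.toDebugString(),             # description of how the RDD was derived
    }
```

Calling `describe_rdd(rawTrainData)` in the shell and posting the result would cover most of the questions above.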
12-08-2015 10:23 PM
Can you please post the full code and the error log?
11-11-2015 04:05 PM
2 Kudos
The pyspark shell is just a Python shell, so dir() should show all existing Python variables (although it also shows all imports and a bunch of other things you may not be looking for).
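One way to cut down the noise is to filter the namespace yourself. A small sketch in plain Python follows. The helper name `user_variables` is made up, and the filter is only a guess at what counts as noise: it drops names starting with an underscore and imported modules.

```python
import types


def user_variables(namespace):
    """Return sorted names from a namespace, skipping private names and modules.

    `user_variables` is an illustrative helper; pass it globals() in the shell.
    """
    return sorted(
        name
        for name, value in namespace.items()
        if not name.startswith("_")                     # skip __builtins__, _, etc.
        and not isinstance(value, types.ModuleType)     # skip `import x` entries
    )
```

In the shell, `user_variables(globals())` gives a shorter list than dir(). Names pulled in with `from x import y` are functions or classes rather than modules, so they will still show up.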
10-29-2015 09:06 PM
Sorry, I don't have anything ready, but it sounds like a good idea to make one. What criteria are we looking to compare by?
10-23-2015 03:50 PM
Looks like a conflict between the Spark and Phoenix jars, no? From Googling the contents of the stack trace, it looks related to Jackson. I'm not familiar with Phoenix. Does it use its own version of Jackson?