Member since 07-29-2013 | 366 Posts | 69 Kudos Received | 71 Solutions
11-18-2014
01:07 PM
It means basically what it says: your program accesses /user/spark, but you're not running it as spark, the user that can access that directory.
11-18-2014
12:06 PM
You're running on YARN, so you should see the application as a "FAILED" application in the Resource Manager UI. Click through and you can find the logs of individual containers, which should show some failure.
11-18-2014
10:54 AM
This isn't the problem. It's just a symptom of the app failing for another reason:

14/11/18 16:27:36 ERROR YarnClientSchedulerBackend: Yarn application already ended: FAILED

You'd have to look at the actual app worker logs to see why it's failing.
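When digging through the container logs for the real failure, it can help to filter for ERROR lines first. This is only a rough sketch, assuming the logs follow the same `yy/MM/dd HH:mm:ss LEVEL Logger: message` layout as the line quoted in this post. The helper name `error_lines` is hypothetical and not part of Spark or YARN.

```python
import re

# Matches lines like "14/11/18 16:27:36 ERROR SomeLogger: message".
# The layout is an assumption based on the single log line quoted in the post.
LOG_LINE = re.compile(
    r"^(?P<date>\d{2}/\d{2}/\d{2}) (?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<logger>[^:]+): (?P<message>.*)$"
)


def error_lines(log_text):
    """Return (logger, message) pairs for every ERROR-level line in log_text."""
    found = []
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "ERROR":
            found.append((match.group("logger"), match.group("message")))
    return found
```

The first ERROR reported by a container is usually a better lead than the driver's "already ended" message, which only reports that the application died.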
11-18-2014
10:52 AM
1 Kudo
Yes, you shouldn't be able to run this as a stand-alone app. Hm, try putting the jar file last? That is how the script says to do it.
11-18-2014
09:22 AM
You need <scope>provided</scope> as well.
11-18-2014
08:42 AM
1 Kudo
How are you executing this? It sounds like you may not be using spark-submit, or are accidentally bundling Spark (perhaps a slightly different version) into your app. Spark deps should be 'provided' in your build, and you'll want to use spark-submit to submit. You shouldn't set the master in your SparkConf in code; pass it to spark-submit instead.
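To make the spark-submit point concrete, here is a rough sketch that assembles the command line with the master passed as an option rather than set in code. The `--class` and `--master` options are real spark-submit flags. The helper name, the jar name, the class name, the `yarn-client` master value and the input argument are all placeholders I'm assuming for illustration.

```python
import shlex


def spark_submit_command(app_jar, main_class, master, app_args=()):
    """Build a spark-submit argument list: options first, then the jar, then app arguments."""
    command = ["spark-submit", "--class", main_class, "--master", master]
    command.append(app_jar)    # the application jar follows the options
    command.extend(app_args)   # anything after the jar is passed to the application
    return command


# Hypothetical jar, class, master and argument; substitute your own.
cmd = spark_submit_command("my-app.jar", "com.example.MyApp", "yarn-client", ["input.txt"])
print(" ".join(shlex.quote(part) for part in cmd))
```

Because the master comes from the command line, the same jar can be submitted to a different cluster manager without rebuilding.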
11-18-2014
08:34 AM
Hm, what do you mean by commutative and associative? And do you mean Hadoop clusters? I'm not sure there's a particular limit to what a Hadoop cluster can do well other than that it's fundamentally a data-parallel paradigm. But most things can be done efficiently in this paradigm, especially random forests. The only things that don't work well are things that require extremely fast async communication -- MPI-style computations. Decision forests are strong and you can certainly do them well on Hadoop.
11-17-2014
01:38 AM
1 Kudo
Random decision forests in MLlib 1.2 can do classification or regression. Yes, it can do bagging. I don't believe it's by feature, no.
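For anyone unsure what bagging means here: each tree is trained on a bootstrap sample of the rows, and the trees' predictions are then combined. Below is a plain-Python sketch of just that idea. It is not MLlib's implementation, and the function names and toy rows are made up for illustration.

```python
import random
from collections import Counter


def bootstrap_sample(rows, rng):
    """Draw len(rows) rows with replacement: one bagged training set."""
    return [rng.choice(rows) for _ in rows]


def majority_vote(predictions):
    """Combine per-tree class predictions by taking the most common label."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(42)  # fixed seed so runs are repeatable
# Hypothetical (feature, label) rows.
rows = [(0.1, "a"), (0.4, "a"), (0.8, "b"), (0.9, "b")]
# One bootstrap sample per tree; a real forest would fit a tree on each.
samples = [bootstrap_sample(rows, rng) for _ in range(3)]
```

For regression, the vote would be replaced by an average of the trees' predictions.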
11-16-2014
01:38 PM
1 Kudo
MLlib supports SVMs in Spark 1.1. It supports Decision Trees in 1.1, and Decision Forests in 1.2, which is not quite yet released. Mahout has an implementation of SVMs and Decision Forests. They are both fairly old and MapReduce-based.
11-11-2014
05:22 AM
1 Kudo
Hm, why not just use the Spark that is part of CDH? If you want 1.1, can you update to CDH 5.2? Are there more logs? This isn't the underlying error.