Member since: 08-11-2014
Posts: 481
Kudos Received: 92
Solutions: 72
11-20-2014
01:31 AM
I would not mess with the installed JARs. In fact, it's possible something was changed inadvertently in the process. My guess is that you are packaging Spark with your app, and it is not the same version as the one installed. You should not package Spark code with a Spark app, since Spark is already provided at runtime.
11-19-2014
01:53 AM
It looks like you asked for more resources than you configured YARN to offer, so check how much you can allocate in YARN and how much Spark asked for. I don't know about the ERROR; it may be a red herring. Please have a look at http://spark.apache.org/docs/latest/ for pretty good Spark docs.
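
As a rough illustration of that check, here is a minimal Python sketch. It assumes the request for each executor is the executor memory plus a memory overhead, and that YARN caps a single container at a configured maximum (the `yarn.scheduler.maximum-allocation-mb` and `spark.yarn.executor.memoryOverhead` settings, as far as I know). The function name and all the numbers are made up for the example; read the real values from your own configuration.

```python
def fits_in_yarn_container(executor_memory_mb, overhead_mb, max_container_mb):
    """Return (fits, requested_mb) for a single executor container.

    Hypothetical helper, not a Spark or YARN API. The request is taken
    to be executor memory plus overhead, compared against the largest
    container YARN is configured to grant.
    """
    requested_mb = executor_memory_mb + overhead_mb
    return requested_mb <= max_container_mb, requested_mb

# Illustrative values only; substitute the ones from your cluster.
fits, requested = fits_in_yarn_container(
    executor_memory_mb=4096,  # e.g. what was passed as executor memory
    overhead_mb=384,          # assumed overhead, check your Spark version
    max_container_mb=4096,    # assumed YARN per-container maximum
)
print(f"requested {requested} MB, fits: {fits}")
```

With these made-up numbers the request exceeds the container maximum even though the executor memory alone does not, which is one common way to end up asking for more than YARN offers.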
11-19-2014
01:08 AM
1 Kudo
Yes, in that example you are clearly running on YARN. So you see it in the history, right? It looks like the example uses yarn-cluster mode, which means the driver was launched on YARN, not locally. The output will be on the YARN container that had the driver. Try yarn-client instead to make your local process the driver and it should print the result on your console.
11-19-2014
12:13 AM
Are you running Spark on YARN, or using Spark standalone? If the latter, you won't see any YARN history, since the app is not using YARN.
11-19-2014
12:10 AM
1 Kudo
I'm not suggesting you log in as spark or a superuser. You shouldn't do this. Instead, change your app to not access directories you don't have access to as your user.
11-19-2014
12:07 AM
Yes, I know what commutativity and associativity are; I was wondering how they relate to Hadoop and decision forests. In theory a reduce function should be commutative and associative, but in practice it does not need to be in MapReduce, and a MapReduce as a unit is not, and certainly Spark is not. The computation paradigm imposes no practical limitation of this kind. I looked into the MLlib RDF code and it does look like it also selects features at random, depending on the configuration. So you could say it bags by examples and by features. The oryx implementation also certainly does all of what you describe. https://github.com/cloudera/oryx/tree/master/rdf-computation
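
To make the point about reduce functions concrete, here is a minimal Python sketch. It is not MapReduce or Spark code, and `reduce_in_partitions` is a hypothetical helper that only mimics grouping inputs into partitions before combining the partial results. Addition gives the same answer under any grouping; subtraction, which is neither commutative nor associative, still runs but its answer depends on the grouping.

```python
from functools import reduce
import operator

def reduce_in_partitions(values, func, num_partitions):
    """Reduce each partition separately, then reduce the partial results.

    Hypothetical helper, not a Hadoop or Spark API. It splits the input
    into contiguous chunks to imitate a distributed reduce.
    """
    size = -(-len(values) // num_partitions)  # ceiling division
    partitions = [values[i:i + size] for i in range(0, len(values), size)]
    partials = [reduce(func, part) for part in partitions]
    return reduce(func, partials)

values = [1, 2, 3, 4, 5, 6]

# Commutative and associative: grouping does not matter.
sum_one = reduce_in_partitions(values, operator.add, 1)
sum_three = reduce_in_partitions(values, operator.add, 3)

# Neither commutative nor associative: grouping changes the result.
diff_one = reduce_in_partitions(values, operator.sub, 1)
diff_three = reduce_in_partitions(values, operator.sub, 3)

print(sum_one, sum_three)    # same value twice
print(diff_one, diff_three)  # two different values
```

Nothing stops the second case from executing, which is the sense in which the property is not enforced; it only affects whether the result is well defined.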
11-19-2014
12:03 AM
That sounds like a bad command line. I don't see that path in the instructions either. Check that you are following the instructions for 5.2 in the previous link.
11-18-2014
01:57 AM
You should use the documentation for CDH 5.2, which is the version you are running and which corresponds to Spark 1.1: http://www.cloudera.com/content/cloudera/en/documentation/core/latest/topics/cdh_ig_running_spark_apps.html You are looking at docs for CDH 5.0.x, which corresponds to Spark 0.9. A lot has changed since then.
10-29-2014
09:17 AM
By default, the distributed implementations prune infrequent items and low-similarity pairs in order to scale. With a tiny data set, this can mean some items are removed entirely, or become un-recommendable. While you can change this behavior, it is not useful to run the Hadoop-based job on such a small data set.
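
A toy Python sketch of the effect, not the Mahout code: `prune_infrequent_items` and the `min_count` threshold are hypothetical stand-ins for whatever pruning the real job applies. On a handful of interactions, even a small threshold drops an item completely, so it can never be recommended.

```python
from collections import Counter

def prune_infrequent_items(interactions, min_count):
    """Keep only (user, item) pairs whose item occurs at least min_count times.

    Hypothetical helper for illustration; the real job's pruning rules
    and parameter names may differ.
    """
    counts = Counter(item for _, item in interactions)
    return [(user, item) for user, item in interactions
            if counts[item] >= min_count]

# Tiny made-up data set: item "C" has a single interaction.
interactions = [
    ("u1", "A"), ("u2", "A"), ("u3", "A"),
    ("u1", "B"), ("u2", "B"),
    ("u3", "C"),
]

kept = prune_infrequent_items(interactions, min_count=2)
remaining_items = sorted({item for _, item in kept})
print(remaining_items)  # "C" is no longer present
```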
10-29-2014
09:16 AM
There is no user-based recommender based on Hadoop MapReduce. The closest thing is indeed the item-based implementation in https://github.com/apache/mahout/tree/master/mrlegacy/src/main/java/org/apache/mahout/cf/taste/hadoop/item It is also a recommender, but no, it always uses item similarity.