Member since 08-11-2014
481 Posts
92 Kudos Received
72 Solutions
My Accepted Solutions
Views | Posted |
---|---|
2984 | 01-26-2018 04:02 AM |
6290 | 12-22-2017 09:18 AM |
3019 | 12-05-2017 06:13 AM |
3280 | 10-16-2017 07:55 AM |
9308 | 10-04-2017 08:08 PM |
02-22-2015
08:52 AM
So, your app only has 3 cores from YARN? Then your app can only be executing 3 tasks in parallel. I'm not sure how many receivers you are starting, but is it fewer than that? It sounds like you expected much more resource to be available, so I'd go look at your YARN config and what's using the resources, and compare that to what Spark is actually requesting.
02-22-2015
08:29 AM
Go to the Spark UI and look at the top of the screen, then click Executors.
02-22-2015
08:11 AM
You usually use --executor-memory to set executor memory, but I don't think it matters. You also generally do not use env variables to configure spark-shell. Although it might be giving the desired results, I'd use the standard command-line flags. It sounds like simpler jobs are working. While you request 8 executors, do you actually get them from YARN? Go look at your Executors tab.
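As a rough illustration of the "standard command-line flags" approach, here is a minimal Python sketch that assembles a spark-shell invocation. The flag names (--master, --num-executors, --executor-memory, --executor-cores) are real spark-shell/spark-submit options; the helper name `spark_shell_command`, the `yarn-client` master (the Spark 1.x form) and the memory and core values are assumptions chosen only for the example.

```python
import shlex

def spark_shell_command(num_executors, executor_memory, executor_cores):
    """Build a spark-shell command that uses flags instead of env variables.

    Hypothetical helper: the argument values are placeholders, not
    recommendations for any particular cluster.
    """
    return [
        "spark-shell",
        "--master", "yarn-client",            # assumed Spark 1.x YARN client mode
        "--num-executors", str(num_executors),
        "--executor-memory", executor_memory,  # e.g. "2g"
        "--executor-cores", str(executor_cores),
    ]

cmd = spark_shell_command(8, "2g", 1)
# Print a copy-pasteable command line with safe shell quoting.
print(" ".join(shlex.quote(part) for part in cmd))
```

Requesting 8 executors this way does not guarantee YARN grants them, so the Executors tab is still the place to confirm what was actually allocated.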
02-22-2015
05:04 AM
OK, what I'm interested in is how many executor slots you have. How many machines, how many executors, how many cores per executor? We want to confirm it's at least as many as the number of receivers. What about a simpler test involving a file-based DStream? If that works, it rules out most things except the custom DStream.
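The slot check described here is simple arithmetic, sketched below in Python. It assumes each receiver occupies one core for as long as the streaming job runs, so only the remaining slots can process batches. The function names are hypothetical, not part of any Spark API.

```python
def total_task_slots(num_executors, cores_per_executor):
    """Number of tasks the application can run at the same time."""
    return num_executors * cores_per_executor

def slots_left_for_processing(num_executors, cores_per_executor, num_receivers):
    """Slots remaining after each receiver has taken one core (assumption)."""
    return total_task_slots(num_executors, cores_per_executor) - num_receivers

# Example: 3 executors with 1 core each and 3 receivers leaves nothing
# to process the received data, so the job appears to hang.
print(slots_left_for_processing(3, 1, 3))
```

A result of zero or less means the receivers consume every slot, which matches the symptom of data arriving but never being processed.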
02-22-2015
04:45 AM
What is your master set to? It needs to allow for all the receivers, plus one, IIRC.
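For a local run, the "all the receivers, plus one" rule can be sketched as follows. This assumes a `local[n]` master string, where n must exceed the number of receivers; the helper name `local_master_for` is hypothetical.

```python
def local_master_for(num_receivers):
    """Return the smallest local[n] master that leaves one thread for processing."""
    if num_receivers < 0:
        raise ValueError("num_receivers must be non-negative")
    # One thread per receiver, plus at least one to process the batches.
    return "local[%d]" % (num_receivers + 1)

print(local_master_for(2))
```

With plain `local` or `local[1]` and a single receiver, the only thread is taken by the receiver and no batches get processed.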
02-16-2015
07:44 AM
Nah, it's not crazy, it just means you have to do some of the work that spark-submit does. Almost all of that is dealing with the classpath. If you're just trying to get a simple app running, I think spark-submit is the way to go. But if you're building a more complex product or service, you might have to embed Spark and deal with this. For example, I had to do just this recently, and here's what I came up with: https://github.com/OryxProject/oryx/tree/master/bin
In future versions (like 1.4+) there's going to be a more proper programmatic submission API. I know Marcelo here has been working on that.
02-16-2015
07:25 AM
In general, you do not run Spark applications directly as Java programs. You have to run them with spark-submit, which sets up the classpath for you. Otherwise you have to set it up yourself, and that's the problem here: you didn't put all of the many YARN / Hadoop / Spark jars on your classpath.
spark-yarn and yarn-parent were discontinued in 1.2.0, but then brought back very recently for 1.2.1. You can see it doesn't exist upstream for 1.2.0: http://search.maven.org/#search%7Cgav%7C1%7Cg%3A%22org.apache.spark%22%20AND%20a%3A%22yarn-parent_2.10%22
CDH 5.3 was based on 1.2.0, so that's why. That said, that is not the artifact you are missing here. You don't even have YARN code on your classpath.
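To give a sense of what "setting up the classpath yourself" involves, here is a minimal Python sketch that collects every jar from a list of directories into one classpath string. It is only an illustration of the idea, not what spark-submit actually does. The helper name `build_classpath` is hypothetical, and the directories you would pass depend entirely on your installation.

```python
import glob
import os

def build_classpath(jar_dirs):
    """Join all *.jar files found in the given directories into a classpath.

    Hypothetical helper: which directories hold the YARN / Hadoop / Spark
    jars is installation-specific and must be supplied by the caller.
    """
    entries = []
    for directory in jar_dirs:
        # Sort for a stable, reproducible ordering within each directory.
        entries.extend(sorted(glob.glob(os.path.join(directory, "*.jar"))))
    # os.pathsep is ":" on Linux/macOS and ";" on Windows.
    return os.pathsep.join(entries)
```

The resulting string would be passed to `java -cp`. Getting every required jar into it is the error-prone part that spark-submit otherwise handles.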
01-08-2015
02:17 AM
Not supported, but all of the standard bits are there. It should work just like any other installation. You will probably have to put the Hive jars on your classpath manually, says Marcelo.
12-03-2014
06:40 AM
What do you mean that item IDs don't get generated? User and item IDs can be strings.
Snappy is required, but no particular version is. I don't know of any bugs in Snappy. It does not depend on Snappy directly but simply requires that Hadoop have Snappy codecs available. However, it does end up embedding the hadoop-client directly to access HDFS, and maybe there is a possibility of a version mismatch here. Did you build the binary to match your version of Hadoop? That's the safest thing. What are you using?
IDs don't matter. If they are strings representing long values they are used directly (i.e. "123" hashes to 123). Random splitting does change results a bit from run to run but shouldn't result in a consistent difference.
OK, I'll try to get time to try it myself.
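The ID rule mentioned above ("123" maps to 123) can be sketched in Python as below. Only the numeric pass-through comes from the post. The fallback for non-numeric strings (the first 8 bytes of an MD5 digest) is an assumption made for illustration and may not match what the real implementation does; `to_long_id` is a hypothetical name.

```python
import hashlib
import re

LONG_MIN, LONG_MAX = -(2 ** 63), 2 ** 63 - 1

def to_long_id(raw_id):
    """Map a string ID to a 64-bit integer ID."""
    # Strings that already represent a long value are used directly.
    if re.fullmatch(r"-?[0-9]+", raw_id):
        value = int(raw_id)
        if LONG_MIN <= value <= LONG_MAX:
            return value
    # ASSUMPTION: any other string is hashed; MD5 is illustrative only.
    digest = hashlib.md5(raw_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)

print(to_long_id("123"))
```

Because the mapping is deterministic, the same string always yields the same numeric ID across runs, so IDs themselves should not cause run-to-run differences.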
11-21-2014
01:07 AM
Yes, but did you also submit your Spark app to YARN? What is your master for the app?