Member since: 08-11-2014
Posts: 481
Kudos Received: 92
Solutions: 72
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 3009 | 01-26-2018 04:02 AM
 | 6345 | 12-22-2017 09:18 AM
 | 3039 | 12-05-2017 06:13 AM
 | 3304 | 10-16-2017 07:55 AM
 | 9421 | 10-04-2017 08:08 PM
05-21-2017
09:41 AM
2.4.x should work, does it not? Or do you need 0.10.1.1+ for compatibility with the 0.10.1.1 broker?
05-16-2017
02:05 AM
1 Kudo
This error is caused by your app rather than by Spark itself. You need to find the executor logs from the app and see what happened.
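As a rough sketch of pulling those executor logs: this assumes the app runs on YARN with log aggregation enabled, which the post does not state. The helper names and the example application ID are hypothetical; `yarn logs -applicationId` is the standard YARN CLI call.

```python
import subprocess


def yarn_logs_command(application_id: str) -> list:
    """Build the argv for fetching the aggregated container logs of a YARN app."""
    # Assumption: IDs follow YARN's usual application_<timestamp>_<sequence> form.
    if not application_id.startswith("application_"):
        raise ValueError("expected an ID like application_<timestamp>_<sequence>")
    return ["yarn", "logs", "-applicationId", application_id]


def fetch_logs(application_id: str) -> str:
    """Run the command and return its stdout; needs the yarn CLI on PATH."""
    result = subprocess.run(
        yarn_logs_command(application_id),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if yarn exits non-zero
    )
    return result.stdout
```

Searching the returned text for the first exception or stack trace in an executor's section is usually the quickest way to find what the app did wrong.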
05-13-2017
01:54 AM
Yes, it has been available for a while, but it's a separate, parallel install so that it doesn't replace Spark 1.x: https://www.cloudera.com/downloads/spark2/2-1.html
04-24-2017
09:40 AM
BTW, I went ahead and made a 2.4.0 release to have something official and probably-working out there. It worked on my CDH 5.11 + Spark 2.1 + Kafka 0.10.0 cluster. Yes, that's a minor problem in the log message: it should reference newYTYSolver. I'll fix that, but it shouldn't otherwise affect anything.
04-21-2017
08:16 AM
Oh, now I see the same 'disconnected' problem you did. It turns out that Kafka 0.10.0 and 0.10.1 are not protocol-compatible, which is quite disappointing. So I think I'm going to have to back up and revert master/2.4 to Kafka 0.10.0, because that's the version CDH is on and I would like to avoid maintaining two builds to support 0.10.0 vs. 0.10.1. I hope it isn't a big deal to switch back in your prototype?
04-20-2017
08:18 AM
Yes, good catch. I'll track that at https://github.com/OryxProject/oryx/issues/329 and fix it in a few minutes.
04-20-2017
06:38 AM
Although I haven't tested anything like that, it just uses really standard APIs in straightforward ways, so I wouldn't be surprised if S3 just works, because HDFS can read/write S3 OK. I know there are some gotchas with actually using S3 as intermediate storage in Spark jobs, but I think your EMR jobs are using local HDFS for that.
04-20-2017
03:26 AM
Yes, they're all coupled only by Kafka, so you could run these layers quite separately, except that they need to share the brokers. It probably won't fit EMR's model, as both should run concurrently and continuously. I'm not sure if it can help you with a shared Kafka either. Obviously, it's also an option to run CDH on AWS if you want to try that. The serving layer does not _generally_ use HDFS unless the model is so big that Kafka can't represent parts of it; in that case it writes to HDFS and reads from it. This really isn't great, but it's the best I could do for now for really large models, and it's something that could be improved at some point, I hope. If you tune Kafka to allow very large models, you can get away without HDFS access.
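To make "tune Kafka to allow very large models" a bit more concrete, here is a minimal sketch that collects the standard Kafka message-size settings for a given maximum model size. The function name, the grouping, and the 10% headroom are illustrative assumptions, not something Oryx prescribes; the property names are stock Kafka configuration keys, and the application on top of Kafka may need a matching limit of its own.

```python
def kafka_size_overrides(max_model_bytes: int, headroom: float = 0.1) -> dict:
    """Collect Kafka size-limit overrides for brokers, topics and clients.

    The same limit is applied everywhere, because a large message has to
    pass the producer, the broker, replication and the consumer in turn.
    """
    if max_model_bytes <= 0:
        raise ValueError("max_model_bytes must be positive")
    # Hypothetical safety margin on top of the raw model size.
    limit = int(max_model_bytes * (1 + headroom))
    return {
        "broker": {
            "message.max.bytes": limit,        # largest message the broker accepts
            "replica.fetch.max.bytes": limit,  # followers must be able to replicate it
        },
        "topic": {"max.message.bytes": limit},          # per-topic override
        "producer": {"max.request.size": limit},        # largest request a producer sends
        "consumer": {"max.partition.fetch.bytes": limit},  # new-consumer fetch size
    }
```

For example, `kafka_size_overrides(64 * 1024 * 1024)` would give limits a little above 64 MB for each component. Very large messages put memory pressure on brokers and clients, so this is a trade-off rather than a free fix.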
04-20-2017
03:09 AM
OK, is it largely working then? If the app looks like it's running, I'll test 2.4 on my cluster too and, if it looks good, go ahead and cut a release.
04-19-2017
10:55 AM
Hm, I don't recall seeing the 'disconnected' message. Is there more detail? On its face, it seems like the serving layer can't see the broker. Do some ports need to be opened?
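One quick way to rule out a closed port is to try a plain TCP connection from the serving layer's host to the broker. This is only a sketch: the function name is hypothetical, and a successful connection shows the port is reachable, not that the Kafka protocol versions match.

```python
import socket


def broker_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures alike.
        return False
```

For instance, `broker_reachable("broker1.example.com", 9092)` checks a hypothetical broker host on 9092, which is Kafka's conventional default port; substitute whatever your brokers actually listen on.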