Member since: 08-11-2014
Posts: 481
Kudos Received: 92
Solutions: 72
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 2984 | 01-26-2018 04:02 AM |
| | 6277 | 12-22-2017 09:18 AM |
| | 3017 | 12-05-2017 06:13 AM |
| | 3279 | 10-16-2017 07:55 AM |
| | 9297 | 10-04-2017 08:08 PM |
11-12-2017
09:26 AM
Nothing about a cluster would prevent it from making external connections, but your firewall rules might. The variables you export here are not related to Spark. It's an error from the library you're using.
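One way to tell a firewall problem apart from a Spark problem is to test the connection from a cluster node with plain Python, no Spark involved. This is only a minimal sketch; the function name `can_connect` is made up for illustration, and the host and port you pass in are whatever your library is trying to reach.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False
```

If this returns False on a worker node but True from a machine outside the cluster for the same host and port, the firewall rules are the likely culprit rather than anything in Spark.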
10-25-2017
10:27 AM
Oops, I meant to write S3 paths. Really, it's Hadoop and its APIs that support or don't support S3. It should be built in to Hadoop distributions, however. I believe you might need the s3a:// scheme instead of s3://.
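If your paths arrive as s3://, rewriting the scheme before handing them to Hadoop or Spark is straightforward. A minimal sketch, assuming s3a is what your distribution supports; `to_s3a` is a hypothetical helper, not part of any library.

```python
def to_s3a(path: str) -> str:
    """Rewrite an s3:// path to use the s3a:// scheme; leave other paths alone."""
    prefix = "s3://"
    if path.startswith(prefix):
        # Keep the bucket and key, swap only the scheme.
        return "s3a://" + path[len(prefix):]
    return path
```

Paths that already use s3a://, hdfs://, or anything else pass through unchanged, so it is safe to apply to every input path.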
10-25-2017
02:52 AM
Even I have forgotten exactly how it works, off the top of my head, but yes, you are correct that you should be able to use HDFS paths. Yes, it runs on Java 7 (or 8, I believe, though I don't recall whether that was tested). It doesn't require Java 8.
10-16-2017
07:55 AM
1 Kudo
Really, this is just saying you can upload data from your local computer, at project creation time or later, to the local file system that the Python/R/Scala sessions see. Those jobs then see the uploads as simple local files and can do what they like with them. But within the same program you can also access whatever data you want, anywhere you want; you just need to write code that does so, via Spark or whatever library you like. There is no either/or here.
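To make the "simple files" point concrete, here is a minimal sketch of a session reading an uploaded file with nothing but the standard library. The function name `count_rows` and the idea that the upload is a CSV with a header line are assumptions for illustration.

```python
import csv


def count_rows(path: str) -> int:
    """Count data rows in a local CSV file, excluding the header line."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header line, if there is one
        return sum(1 for _ in reader)
```

The same program is free to go on and read from HDFS, S3, or a database with whatever library handles that source; the local file is just one more input.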
10-06-2017
12:05 AM
Are you looking for the .jar files that were produced as part of the release? Those are still in the repo and will stay there indefinitely as far as I know, since they could be part of people's builds: https://repository.cloudera.com/artifactory/cloudera-repos/com/cloudera/oryx/
10-04-2017
08:08 PM
Oh, I forgot: we have made many obsolete repos under github.com/cloudera private. I can still see it, but of course you can't. Here's a tarball of the final release: https://drive.google.com/open?id=0B_hfrkaWlLi4MVlxQWVJaVd0ZGs If there's any significant demand, I could revive the repo in my personal account.
10-04-2017
07:55 PM
That implementation is obsolete at this point, I'd say, but sure, you're welcome to go dig it out. It worked well. The releases and source are still on the 1.x project site: https://github.com/cloudera/oryx/releases
07-23-2017
01:53 AM
1 Kudo
CDH already supports Spark 2.2, right?
07-22-2017
12:34 AM
Unsupported != doesn't work. Spark Streaming is shipped as-is and you can use structured streaming. The distro wouldn't include breaking changes to public APIs even where not supported.
07-21-2017
01:49 PM
Yes, you have two services for the history servers. Yes, you need to build your app against either Spark 1 or Spark 2 and then run it with the matching version.
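For the "run with the right version" part, a small sketch of picking the launcher by the Spark major version the app was built against. This assumes the usual CDH layout where the Spark 2 add-on installs `spark2-submit` alongside the Spark 1 `spark-submit`; `submit_command` is a hypothetical helper, so check the command names on your own cluster.

```python
def submit_command(spark_major: int) -> str:
    """Return the launcher name for the Spark major version an app targets."""
    if spark_major == 1:
        return "spark-submit"   # Spark 1 launcher (assumed CDH default)
    if spark_major == 2:
        return "spark2-submit"  # Spark 2 launcher (assumed Spark 2 add-on name)
    raise ValueError(f"unsupported Spark major version: {spark_major}")
```

An app built against Spark 2 and launched with the Spark 1 command (or the reverse) will typically fail with class or method errors at runtime, which is why the two need to match.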