Member since: 08-11-2014
Posts: 481
Kudos Received: 92
Solutions: 72
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3030 | 01-26-2018 04:02 AM |
| | 6378 | 12-22-2017 09:18 AM |
| | 3063 | 12-05-2017 06:13 AM |
| | 3321 | 10-16-2017 07:55 AM |
| | 9498 | 10-04-2017 08:08 PM |
03-12-2015
05:47 AM
I don't think there's anything special to know beyond what's documented in the RHadoop subprojects, so it's not something that we ship, support or document separately. I have set up the RHadoop libraries with CDH and it's straightforward. It's really a set of client-side libraries that you install into *R*, not *Hadoop*. However, to run rmr2 you will need R installed locally on all of your Hadoop cluster nodes, since it will run MapReduce jobs that execute R scripts.
I recall that you have to install a bunch of other R packages before installing the rhdfs/rhbase/plyrmr libraries, and I found this in my notes as the set of prerequisites:
install.packages(c("Rcpp", "RJSONIO", "bitops", "digest", "functional", "reshape2", "stringr", "plyr", "caTools", "rJava", "dplyr", "R.methodsS3", "Hmisc"))
03-11-2015
11:46 AM
In general, NoSuchMethodError in Java means you compiled against one version of something, and executed against a different version. Check your build.
03-11-2015
07:21 AM
You can also just add the Hive jars to your app classpath. The catch here is that Spark doesn't quite support the later version of Hive in CDH. This might work for what you're trying to do, but if you build your own, you're building for a slightly different version of Hive than the one you run here.
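As a rough sketch of the "add the jars to the classpath" step, the snippet below joins every jar in a directory into a colon-separated classpath string. The directory path, the `HIVE_LIB_DIR` variable and the `build_classpath` helper are all made up for illustration; how you then hand the string to your application depends on how you launch it.

```shell
#!/usr/bin/env bash
# Hypothetical location of the Hive jars; adjust to your own installation.
HIVE_LIB_DIR="${HIVE_LIB_DIR:-/path/to/hive/lib}"

# Join every *.jar in the given directory into a colon-separated string.
build_classpath() {
  local dir="$1"
  local cp=""
  local jar
  for jar in "$dir"/*.jar; do
    [ -e "$jar" ] || continue   # the glob matched nothing; skip it
    cp="${cp:+$cp:}$jar"        # add ":" only between entries
  done
  printf '%s\n' "$cp"
}

HIVE_CLASSPATH="$(build_classpath "$HIVE_LIB_DIR")"
echo "Hive classpath entries: ${HIVE_CLASSPATH:-<none found>}"
```

This only assembles the string; it does not change the compatibility caveat about Hive versions.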
03-09-2015
03:02 AM
This has been answered a few times here already. Have a look at, for example, http://community.cloudera.com/t5/Advanced-Analytics-Apache-Spark/I-am-using-a-hive-cotext-in-pyspark-cdh5-3-virtual-box-and-i-get/m-p/24418#U24418 The short answer is that Spark is not entirely compatible with the recent versions of Hive found in CDH, but may still work for a lot of use cases. The Spark bits are still there. You have to add Hive to the classpath yourself.
03-06-2015
12:08 PM
No, it's not something you really install. It's a library. However, I think you'll find the binaries already exist on all the nodes in your cluster anyway.
03-06-2015
11:43 AM
In bash for example: unset VARIABLE
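A minimal sketch of what that looks like in practice. `MY_SETTING` is a made-up name used only for illustration; substitute the variable you actually want to clear.

```shell
# MY_SETTING is a hypothetical variable, not one from the original question.
export MY_SETTING="some-value"
echo "before: ${MY_SETTING-<unset>}"   # ${VAR-text} expands to text only when VAR is unset

unset MY_SETTING                       # removes the variable from this shell and its future children
echo "after: ${MY_SETTING-<unset>}"
```

Note that `unset` only affects the current shell session; if the variable is set in a startup file it will come back in a new shell.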
03-06-2015
11:34 AM
It's just an env variable; you can always "export VARIABLE=VALUE" in the shell. But this message is not an error. In fact, you want to see this if you intend to run on a cluster.
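To illustrate why the `export` matters, here is a small sketch showing that a plain assignment stays in the current shell, while an exported variable is inherited by child processes. `VARIABLE` and `VALUE` are just the placeholders from above, not real settings.

```shell
unset VARIABLE                      # start from a clean slate
VARIABLE=VALUE                      # plain assignment: visible in this shell only
sh -c 'echo "child sees: ${VARIABLE:-<nothing>}"'

export VARIABLE                     # mark it for export to child processes
sh -c 'echo "child sees: ${VARIABLE:-<nothing>}"'
```

The first child process should report nothing and the second should report the value, which is the behavior you need for anything launched from that shell.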
03-06-2015
08:32 AM
Mahout is already shipped with CDH. It's not something you really install; it's a library. Can you say any more about the problem you are facing?
03-05-2015
05:04 AM
1 Kudo
Yes, but it's a member of a class. When the class is instantiated on the remote worker, that member is null again. Make the Broadcast a member of the new function you are defining.
03-05-2015
03:27 AM
What is null? I don't see you using a broadcast variable in a closure here. You just put one in a static member, which isn't going to work.