Member since: 08-11-2014
Posts: 481
Kudos Received: 92
Solutions: 72
My Accepted Solutions
| Title | Views | Posted |
|---|---|---|
| | 3089 | 01-26-2018 04:02 AM |
| | 6517 | 12-22-2017 09:18 AM |
| | 3144 | 12-05-2017 06:13 AM |
| | 3407 | 10-16-2017 07:55 AM |
| | 9759 | 10-04-2017 08:08 PM |
02-07-2017
08:22 PM
How can I tell whether the parcel files should have a sha or a sha1 signature? When will Spark2 be part of the CDH4 parcels, and in which version? I currently have 1.6; when I add 2.0, will the Spark History Server also be 2.0? If I use the 2.0 parcel, which one will be active on the cluster, 1.6 or 2.0?
01-16-2017
09:06 PM
@justin3113 To run jobs across all nodes, the user must exist on each node (run id justin3113 to check, for example). Each user also needs a user directory under /user in HDFS, with read and write access to it, so that the job can write temporary data to HDFS from whichever node it is running on. The error says the job is trying to create that user directory, but only the hdfs user has permission to do so. Opening up access gets around it, but that is not advisable. Instead, for each user, switch to the hdfs user and create the directory: su - hdfs, then hdfs dfs -mkdir /user/justin3113.
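If there are several users to set up, the steps can be scripted. The following is a minimal sketch in Python that only builds and prints the commands; it does not run them. The helper name home_dir_commands, the use of sudo -u hdfs in place of su - hdfs, the -p flag, and the extra -chown step (so that the user owns the new directory) are assumptions added for illustration, and the group name will depend on your cluster.

```python
import shlex


def home_dir_commands(user, group=None):
    """Build the commands that create /user/<user> in HDFS and give it to the user.

    Hypothetical helper: it only returns the argument lists, nothing is executed.
    """
    group = group or user  # assumption: the group is named after the user
    home = f"/user/{user}"
    return [
        # Create the directory as the hdfs superuser.
        ["sudo", "-u", "hdfs", "hdfs", "dfs", "-mkdir", "-p", home],
        # Assumed extra step: hand ownership to the user so they can read and write.
        ["sudo", "-u", "hdfs", "hdfs", "dfs", "-chown", f"{user}:{group}", home],
    ]


if __name__ == "__main__":
    for user in ["justin3113"]:
        for cmd in home_dir_commands(user):
            print(shlex.join(cmd))
```

Review the printed commands first, then run them on a node where the hdfs client is configured.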
12-29-2016
02:26 AM
I'm using sbt. Should I use spark-submit every time I need to run a project? sbt run is meeting my needs for now, as I'm using it in local mode.
12-16-2016
09:08 AM
It won't be terribly different -- like a maintenance release generally contains a small number of fixes -- but yes you will want to update it in general. You will need the GA version if you want production support, too.
11-17-2016
10:56 PM
Hi, I am following the steps at this link to install RHadoop on Cloudera: https://ashokharnal.wordpress.com/2013/08/25/installing-r-rhadoop-and-rstudio-over-cloudera-hadoop-ecosystem/#comment-2441 Will it work for Cloudera 1.6? Thanks.
11-16-2016
03:12 AM
<repository>
  <id>Cloudera Repository</id>
  <url>https://repository.cloudera.com/content/repositories/releases/</url>
</repository>
<repository>
  <id>Cloudera Beta Repository</id>
  <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
</repository>

I'm using these links 🙂
11-09-2016
08:02 PM
Hi, were you able to resolve this problem?

ip-10-0-0-5.ec2.internal, executor 1): java.lang.AbstractMethodError
  at org.apache.spark.Logging$class.log(Logging.scala:50)
  at org.apache.spark.streaming.twitter.TwitterReceiver.log(TwitterInputDStream.scala:60)
  at org.apache.spark.Logging$class.logInfo(Logging.scala:58)
  at org.apache.spark.streaming.twitter.TwitterReceiver.logInfo(TwitterInputDStream.scala:60)
  at org.apache.spark.streaming.twitter.TwitterReceiver.onStart(TwitterInputDStream.scala:96)
  at org.apache.spark.streaming.receiver.ReceiverSupervisor.startReceiver(ReceiverSupervisor.s
09-26-2016
01:29 AM
1 Kudo
The essential point here is that you want to avoid a shuffle, and you can avoid a shuffle if both RDDs are partitioned in the same way, because then all values for the same key are already on one partition in each RDD. join calls cogroup, so yes, both can accomplish this, as long as both RDDs have the same partitioner. This won't be true, however, if you first flatMap one of the RDDs, because flatMap can't be known to retain the partitioning.
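To make the reasoning concrete, here is a small sketch in plain Python. It does not use the Spark API; it only simulates hash partitioning to show why two datasets placed by the same partitioner can be joined bucket by bucket with no data moving between buckets. The function names and the sample data are made up for illustration.

```python
from collections import defaultdict


def partition_by(pairs, num_partitions):
    """Place each (key, value) pair in the bucket hash(key) % num_partitions."""
    buckets = [[] for _ in range(num_partitions)]
    for key, value in pairs:
        buckets[hash(key) % num_partitions].append((key, value))
    return buckets


def join_bucket(left, right):
    """Join two buckets locally; no pair from any other bucket is needed."""
    right_by_key = defaultdict(list)
    for key, value in right:
        right_by_key[key].append(value)
    return [
        (key, (lv, rv))
        for key, lv in left
        for rv in right_by_key.get(key, [])
    ]


def copartitioned_join(left_pairs, right_pairs, num_partitions=4):
    """Partition both sides the same way, then join matching buckets only."""
    left = partition_by(left_pairs, num_partitions)
    right = partition_by(right_pairs, num_partitions)
    joined = []
    for left_bucket, right_bucket in zip(left, right):
        joined.extend(join_bucket(left_bucket, right_bucket))
    return joined


users = [(1, "ann"), (2, "bob"), (3, "cy")]
orders = [(1, "book"), (3, "pen"), (3, "ink")]
print(sorted(copartitioned_join(users, orders)))
```

If the keys on one side are changed after partitioning, which is what a flatMap is free to do, a pair can sit in a bucket that no longer matches hash(key), and the bucket-by-bucket join would miss matches. That is why the partitioner is not kept after flatMap and a shuffle is needed again.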
09-03-2016
05:12 AM
Yes, exactly, that is what I mean. I would like to copy the result file to my local machine.