Member since
05-30-2016
25
Posts
5
Kudos Received
1
Solution
My Accepted Solutions
Title | Views | Posted
---|---|---
 | 5057 | 06-01-2016 03:07 PM
07-20-2016
02:42 PM
@Benjamin Leonhardi
This is what I think I can do:

KMeansModel clusters = KMeans.train(parsedData.rdd(), numClusters, numIterations);
JavaRDD<Integer> clusterPoints = clusters.predict(parsedData);
List<Integer> list = clusterPoints.collect();
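A minimal sketch of keeping each point aligned with its predicted cluster ID, assuming `parsedData` is a `JavaRDD<Vector>` and `clusters` is the trained model from the snippet above (the class and method names here are illustrative, not from the original post):

```java
import java.util.List;

import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.clustering.KMeansModel;
import org.apache.spark.mllib.linalg.Vector;

import scala.Tuple2;

public class AssignClusters {
    public static List<Tuple2<Vector, Integer>> assignments(
            JavaRDD<Vector> parsedData, KMeansModel clusters) {
        // predict() over a JavaRDD<Vector> yields one cluster index per point;
        // zip() pairs each input point with its own prediction, in order.
        JavaPairRDD<Vector, Integer> assigned =
                parsedData.zip(clusters.predict(parsedData));
        return assigned.collect();
    }
}
```

`zip()` requires both RDDs to have the same partitioning and element counts, which holds here because the predictions are derived directly from `parsedData`.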
07-20-2016
02:31 PM
@Benjamin Leonhardi thank you for your answer. Can you please tell me how to extract the cluster information as a List<Integer>, where the list contains the coordinates of the clustered data?
07-20-2016
12:44 PM
Hello, can you please explain to me what kind of data I get when I use Spark clustering from MLlib, like the following:

KMeansModel clusters = KMeans.train(parsedData.rdd(), numClusters, numIterations);
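For context, a short sketch of what the trained `KMeansModel` exposes, assuming `parsedData`, `numClusters`, and `numIterations` as in the snippet above (the wrapper class is illustrative):

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.mllib.clustering.KMeans;
import org.apache.spark.mllib.clustering.KMeansModel;
import org.apache.spark.mllib.linalg.Vector;

public class InspectModel {
    public static void inspect(JavaRDD<Vector> parsedData,
                               int numClusters, int numIterations) {
        KMeansModel clusters =
                KMeans.train(parsedData.rdd(), numClusters, numIterations);
        // The model holds the learned centroids ...
        Vector[] centers = clusters.clusterCenters();
        // ... can assign any point to its nearest centroid ...
        int id = clusters.predict(parsedData.first());
        // ... and can report the within-set sum of squared errors.
        double wssse = clusters.computeCost(parsedData.rdd());
        System.out.println(centers.length + " centers, cost " + wssse
                + ", first point in cluster " + id);
    }
}
```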
Labels:
- Apache Spark
07-20-2016
09:25 AM
@Marco Gaido thank you for your answer, it's really helpful. Can you please tell me how to store the vectors to HDFS after converting them, and then read them back from HDFS to use them in Spark k-means for clustering, as in KMeansModel clusters = KMeans.train
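One hedged sketch of a simple round trip: persisting the vectors to HDFS and reading them back ready for `KMeans.train(...)`, using Spark's object-file format rather than a hand-rolled SequenceFile writer. The path and class names are illustrative assumptions:

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.linalg.Vector;

public class VectorStore {
    // Assumed HDFS location for illustration only.
    static final String PATH = "hdfs:///tmp/points";

    public static void save(JavaRDD<Vector> vectors) {
        // Writes the vectors as a SequenceFile of serialized Java objects.
        vectors.saveAsObjectFile(PATH);
    }

    public static JavaRDD<Vector> load(JavaSparkContext sc) {
        // Reads them back as a JavaRDD<Vector>, ready for clustering.
        return sc.objectFile(PATH);
    }
}
```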
07-20-2016
07:27 AM
@Arun A K thank you for your answer. First, I have a Vector, not a List of RDDs; second, I am using Java.
07-19-2016
03:03 PM
I wrote Vectors (org.apache.spark.mllib.linalg.Vector) to HDFS as follows:

public void writePointsToFile(Path path, FileSystem fs, Configuration conf,
        List<Vector> points) throws IOException {
    SequenceFile.Writer writer = SequenceFile.createWriter(conf,
            Writer.file(path), Writer.keyClass(LongWritable.class),
            Writer.valueClass(Vector.class));
    long recNum = 0;
    for (Vector point : points) {
        writer.append(new LongWritable(recNum++), point);
    }
    writer.close();
}

(I'm not sure this is the right way to do it; I can't test it yet.) Now I need to read this file back as a JavaRDD<Vector>, because I want to use it in Spark k-means clustering, but I don't know how to do that.
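One caveat worth hedging: Spark's `org.apache.spark.mllib.linalg.Vector` is not a Hadoop `Writable`, so `SequenceFile.Writer` would likely reject it as a value class; a `Writable` wrapper such as Mahout's `VectorWritable` is the usual value type. Assuming the file was written that way, a sketch of reading it back as a `JavaRDD<Vector>` (class and method names here are illustrative):

```java
import org.apache.hadoop.io.LongWritable;
import org.apache.mahout.math.VectorWritable;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.mllib.linalg.Vector;
import org.apache.spark.mllib.linalg.Vectors;

public class ReadPoints {
    public static JavaRDD<Vector> read(JavaSparkContext sc, String path) {
        return sc.sequenceFile(path, LongWritable.class, VectorWritable.class)
                 .map(pair -> {
                     // Hadoop reuses Writable instances during reads, so copy
                     // each record's values out before returning.
                     org.apache.mahout.math.Vector mv = pair._2().get();
                     double[] values = new double[mv.size()];
                     for (int i = 0; i < mv.size(); i++) {
                         values[i] = mv.get(i);
                     }
                     return Vectors.dense(values);
                 });
    }
}
```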
Labels:
- Apache Hadoop
- Apache Spark
07-19-2016
11:56 AM
I have a VectorWritable (org.apache.mahout.math.VectorWritable) which comes from a sequence file generated by Mahout, something like the following:

public void write(List<Vector> points, int clustersNumber, HdfsConnector connector)
        throws IOException {
    this.writePointsToFile(new Path(connector.getPointsInput(), "pointsInput"),
            connector.getFs(), connector.getConf(), points);
    Path clusterCentroids = new Path(connector.getClustersInput(), "part-0");
    SequenceFile.Writer writer = SequenceFile.createWriter(
            connector.getConf(), Writer.file(clusterCentroids),
            Writer.keyClass(Text.class), Writer.valueClass(Kluster.class));
    List<Vector> centroids = getCentroids();
    for (int i = 0; i < centroids.size(); i++) {
        Vector vect = centroids.get(i);
        Kluster centroidCluster = new Kluster(vect, i,
                new SquaredEuclideanDistanceMeasure());
        writer.append(new Text(centroidCluster.getIdentifier()),
                centroidCluster);
    }
    writer.close();
}

I would like to convert that into Spark's Vector type (org.apache.spark.mllib.linalg.Vectors) as a JavaRDD<Vector>. How can I do that in Java? I've read something about sequenceFile in Spark, but I couldn't figure out how to do it.
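The per-element conversion from a Mahout vector to a Spark MLlib vector can be sketched as below, preserving sparsity by iterating only the non-zero entries. This is a hedged illustration; the Mahout-side calls (`size()`, `nonZeroes()`, `Element.index()`, `Element.get()`) follow the Mahout math API, and the class name is invented:

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.mahout.math.Vector.Element;
import org.apache.spark.mllib.linalg.Vectors;

public class MahoutToSpark {
    public static org.apache.spark.mllib.linalg.Vector convert(
            org.apache.mahout.math.Vector mv) {
        List<Integer> indices = new ArrayList<>();
        List<Double> values = new ArrayList<>();
        // Walk only the non-zero entries of the Mahout vector.
        for (Element e : mv.nonZeroes()) {
            indices.add(e.index());
            values.add(e.get());
        }
        int[] idx = indices.stream().mapToInt(Integer::intValue).toArray();
        double[] vals = values.stream().mapToDouble(Double::doubleValue).toArray();
        return Vectors.sparse(mv.size(), idx, vals);
    }
}
```

Mapping this function over a `JavaPairRDD<LongWritable, VectorWritable>` loaded with `sc.sequenceFile(...)` would then produce the desired `JavaRDD<Vector>`.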
Labels:
- Apache Spark
06-01-2016
03:07 PM
1 Kudo
In sandbox 2.4 the default username and password (maria_dev) only have read permission. To get admin access you need to reset the admin username and password, which you can do by launching the script ambari-admin-password-reset. After that you can log in to Ambari with the username and password you just entered, and you will have admin permission 🙂
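The reset step above, run on the sandbox shell (as root; the script prompts interactively for the new password):

```shell
# Reset the Ambari admin password on the HDP sandbox.
# The script asks for the new password and restarts the Ambari server.
ambari-admin-password-reset
```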