Created on 08-25-2016 09:51 PM - edited 09-16-2022 08:41 AM
Hello Cloudera
I have an update on my NaN problem.
I have discovered that I can use mahout seqdumper to view the vectors written by the mahout arff.vector command, and so check whether it is actually writing the vectors properly.
I checked all three files (iris.arff.mvc, seeds.arff.mvc and balance.arff.mvc) using mahout seqdumper.
It turns out that it was in fact mahout arff.vector that created the NaN values, which were then carried over into my kmeans/canopy and clusterdump output.
Here is the seqdumper output for my seeds and iris datasets, followed by my balance scale dataset (which works okay):
[Masternode@Masterdatanode ~]$ mahout seqdumper -i /user/Masternode/seeds/seeds_data.arff.mvc > /tmp/seeds/dump.txt
16/08/26 01:52:49 WARN driver.MahoutDriver: No seqdumper.props found on classpath, will use command-line arguments only
16/08/26 01:52:50 INFO common.AbstractJob: Command line arguments: {--endPhase=[2147483647], --input=[/user/Masternode/seeds/seeds_data.arff.mvc], --startPhase=[0], --tempDir=[temp]}
16/08/26 01:52:53 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
16/08/26 01:52:53 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
16/08/26 01:52:53 INFO driver.MahoutDriver: Program took 3827 ms (Minutes: 0.06378333333333333)
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/mahout/mahout-examples-0.9-cdh5.6.0-job.jar
Input Path: /user/Masternode/seeds/seeds_data.arff.mvc
Key class: class org.apache.hadoop.io.LongWritable Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:NaN,5:NaN,6:NaN,7:1.0}
Key: 1: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:NaN,5:NaN,6:NaN,7:1.0}
Key: 2: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:NaN,5:NaN,6:NaN,7:1.0}
Key: 3: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:NaN,5:NaN,6:NaN,7:1.0}
Key: 4: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:NaN,5:NaN,6:NaN,7:1.0}
: :
: :
Key: 205: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:NaN,5:NaN,6:NaN,7:3.0}
Key: 206: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:NaN,5:NaN,6:NaN,7:3.0}
Key: 207: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:NaN,5:NaN,6:NaN,7:3.0}
Key: 208: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:NaN,5:NaN,6:NaN,7:3.0}
Key: 209: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:NaN,5:NaN,6:NaN,7:3.0}
Count: 210
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
[Masternode@Masterdatanode ~]$ mahout seqdumper -i /user/Masternode/iris_data/kmeans3/iris.arff.mvc > /tmp/iris_data/dump.txt
16/08/26 03:52:28 WARN driver.MahoutDriver: No seqdumper.props found on classpath, will use command-line arguments only
16/08/26 03:52:29 INFO common.AbstractJob: Command line arguments: {--endPhase=[2147483647], --input=[/user/Masternode/iris_data/kmeans3/iris.arff.mvc], --startPhase=[0], --tempDir=[temp]}
16/08/26 03:52:32 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
16/08/26 03:52:32 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
16/08/26 03:52:32 INFO driver.MahoutDriver: Program took 3746 ms (Minutes: 0.062433333333333334)
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/mahout/mahout-examples-0.9-cdh5.6.0-job.jar
Input Path: /user/Masternode/iris_data/kmeans3/iris.arff.mvc
Key class: class org.apache.hadoop.io.LongWritable Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:1.0}
Key: 1: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:1.0}
Key: 2: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:1.0}
Key: 3: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:1.0}
Key: 4: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:1.0}
: :
: :
Key: 145: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:3.0}
Key: 146: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:3.0}
Key: 147: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:3.0}
Key: 148: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:3.0}
Key: 149: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:3.0}
Count: 150
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
[Masternode@Masterdatanode ~]$ mahout seqdumper -i /user/Masternode/balance/balance.arff.mvc > /tmp/balance/dump.txt
16/08/26 01:58:33 WARN driver.MahoutDriver: No seqdumper.props found on classpath, will use command-line arguments only
16/08/26 01:58:34 INFO common.AbstractJob: Command line arguments: {--endPhase=[2147483647], --input=[/user/Masternode/balance/balance.arff.mvc], --startPhase=[0], --tempDir=[temp]}
16/08/26 01:58:37 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
16/08/26 01:58:37 INFO compress.CodecPool: Got brand-new decompressor [.deflate]
16/08/26 01:58:37 INFO driver.MahoutDriver: Program took 3889 ms (Minutes: 0.06481666666666666)
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/mahout/mahout-examples-0.9-cdh5.6.0-job.jar
Input Path: /user/Masternode/balance/balance.arff.mvc
Key class: class org.apache.hadoop.io.LongWritable Value Class: class org.apache.mahout.math.VectorWritable
Key: 0: Value: {0:1.0,1:1.0,2:1.0,3:1.0,4:2.0}
Key: 1: Value: {0:1.0,1:1.0,2:1.0,3:2.0,4:3.0}
Key: 2: Value: {0:1.0,1:1.0,2:1.0,3:3.0,4:3.0}
Key: 3: Value: {0:1.0,1:1.0,2:1.0,3:4.0,4:3.0}
Key: 4: Value: {0:1.0,1:1.0,2:1.0,3:5.0,4:3.0}
: :
: :
Key: 620: Value: {0:5.0,1:5.0,2:5.0,3:1.0,4:1.0}
Key: 621: Value: {0:5.0,1:5.0,2:5.0,3:2.0,4:1.0}
Key: 622: Value: {0:5.0,1:5.0,2:5.0,3:3.0,4:1.0}
Key: 623: Value: {0:5.0,1:5.0,2:5.0,3:4.0,4:1.0}
Key: 624: Value: {0:5.0,1:5.0,2:5.0,3:5.0,4:2.0}
Count: 625
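For anyone who wants to run the same check without reading the dump by eye, here is a minimal Python sketch that counts how many dumped vectors contain NaN. It is only an illustration: the function name count_nan_vectors is made up for this example, and it assumes the dump has one "Key: ... Value: {...}" record per line, as in the seqdumper text files above.

```python
import re

# Matches one "index:value" entry inside a dumped vector such as {0:NaN,7:1.0}.
ENTRY = re.compile(r"(\d+):([^,}]+)")


def count_nan_vectors(dump_lines):
    """Return (total_vectors, vectors_with_nan) for seqdumper text lines.

    Hypothetical helper: assumes one "Key: ... Value: {...}" record per line.
    """
    total = 0
    with_nan = 0
    for line in dump_lines:
        # Skip header lines such as "Input Path:" and the final "Count:" line.
        if not line.startswith("Key:") or "Value:" not in line:
            continue
        total += 1
        vector_text = line[line.index("Value:"):]
        values = [value.strip() for _, value in ENTRY.findall(vector_text)]
        if "NaN" in values:
            with_nan += 1
    return total, with_nan


# Two records copied from the dumps above: one broken, one healthy.
sample = [
    "Input Path: /user/Masternode/seeds/seeds_data.arff.mvc",
    "Key: 0: Value: {0:NaN,1:NaN,2:NaN,3:NaN,4:NaN,5:NaN,6:NaN,7:1.0}",
    "Key: 0: Value: {0:1.0,1:1.0,2:1.0,3:1.0,4:2.0}",
    "Count: 2",
]
print(count_nan_vectors(sample))
```

To check a real dump, pass an open file instead of the sample list, for example the /tmp/seeds/dump.txt file written by the command above.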
Here are the clusters for the balance scale dataset:
[Masternode@Masterdatanode ~]$ mahout clusterdump -i /user/Masternode/balance/kmeans-out/clusters-1-final -o /tmp/balance/balance_clusters.txt -p /user/Masternode/balance/kmeans-out/clusteredPoints -dm org.apache.mahout.common.distance.TanimotoDistanceMeasure
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/hadoop/bin/hadoop and HADOOP_CONF_DIR=/etc/hadoop/conf
MAHOUT-JOB: /opt/cloudera/parcels/CDH-5.6.0-1.cdh5.6.0.p0.45/lib/mahout/mahout-examples-0.9-cdh5.6.0-job.jar
16/08/22 23:05:40 WARN driver.MahoutDriver: No clusterdump.props found on classpath, will use command-line arguments only
16/08/22 23:05:40 INFO common.AbstractJob: Command line arguments: {--dictionaryType=[text], --distanceMeasure=[org.apache.mahout.common.distance.TanimotoDistanceMeasure], --endPhase=[2147483647], --input=[/user/Masternode/balance/kmeans-out/clusters-1-final], --output=[/tmp/balance/balance_clusters.txt], --outputFormat=[TEXT], --pointsDir=[/user/Masternode/balance/kmeans-out/clusteredPoints], --startPhase=[0], --tempDir=[temp]}
16/08/22 23:05:44 INFO clustering.ClusterDumper: Wrote 3 clusters
16/08/22 23:05:44 INFO driver.MahoutDriver: Program took 4136 ms (Minutes: 0.06893333333333333)
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
VL-410{n=213 c=[4.038, 2.131, 2.746, 2.446, 1.737] r=[0.968, 1.058, 1.361, 1.287, 0.917]}
Weight : [props - optional]: Point:
1.0 : [distance=0.39346254907223044]: [2.000, 1.000, 1.000, 1.000, 1.000]
1.0 : [distance=0.29099665375325723]: [2.000, 1.000, 1.000, 2.000, 2.000]
1.0 : [distance=0.2737469670505256]: [2.000, 1.000, 2.000, 1.000, 2.000]
1.0 : [distance=0.2602217061100983]: [2.000, 1.000, 3.000, 1.000, 3.000]
1.0 : [distance=0.2703429967956935]: [2.000, 1.000, 4.000, 1.000, 3.000]
: :
VL-82{n=275 c=[1.975, 3.033, 2.669, 3.676, 2.415] r=[1.011, 1.379, 1.344, 1.242, 0.875]}
Weight : [props - optional]: Point:
1.0 : [distance=0.4843955662442676]: [1.000, 1.000, 1.000, 1.000, 2.000]
1.0 : [distance=0.3310139832226746]: [1.000, 1.000, 1.000, 2.000, 3.000]
1.0 : [distance=0.25039239421781456]: [1.000, 1.000, 1.000, 3.000, 3.000]
1.0 : [distance=0.21916087567042697]: [1.000, 1.000, 1.000, 4.000, 3.000]
1.0 : [distance=0.23026798575608354]: [1.000, 1.000, 1.000, 5.000, 3.000]
: :
VL-370{n=140 c=[3.429, 4.271, 4.043, 2.486, 1.579] r=[1.283, 0.877, 1.095, 1.344, 0.854]}
Weight : [props - optional]: Point:
1.0 : [distance=0.291345734798266]: [1.000, 2.000, 5.000, 1.000, 3.000]
1.0 : [distance=0.22857469129979302]: [1.000, 3.000, 4.000, 1.000, 3.000]
1.0 : [distance=0.22469106732898325]: [1.000, 3.000, 5.000, 1.000, 3.000]
1.0 : [distance=0.18798153075739144]: [1.000, 3.000, 5.000, 2.000, 3.000]
1.0 : [distance=0.27974934890340164]: [1.000, 4.000, 2.000, 1.000, 1.000]
: :
On further inspection of my balance.arff dataset, I noticed that its data consist only of integers separated by commas, i.e.:
@relation balance-scale
@attribute left-weight numeric
@attribute left-distance numeric
@attribute right-weight numeric
@attribute right-distance numeric
@attribute class { L, B, R}
@data
1,1,1,1,B
1,1,1,2,R
1,1,1,3,R
1,1,1,4,R
1,1,1,5,R
1,1,2,1,R
1,1,2,2,R
1,1,2,3,R
1,1,2,4,R
1,1,2,5,R
: :
My other datasets, seeds.arff and iris.arff, contain double and float values instead:
@relation seeds
@attribute area numeric
@attribute perimeter numeric
@attribute compactness numeric
@attribute kernel-length numeric
@attribute kernel-width numeric
@attribute asymmetry numeric
@attribute kernel-groove numeric
@attribute class { 1, 2, 3}
@data
15.26,14.84,0.871,5.763,3.312,2.221,5.22,1
14.88,14.57,0.8811,5.554,3.333,1.018,4.956,1
14.29,14.09,0.905,5.291,3.337,2.699,4.825,1
13.84,13.94,0.8955,5.324,3.379,2.259,4.805,1
16.14,14.99,0.9034,5.658,3.562,1.355,5.175,1
14.38,14.21,0.8951,5.386,3.312,2.462,4.956,1
14.69,14.49,0.8799,5.563,3.259,3.586,5.219,1
14.11,14.1,0.8911,5.42,3.302,2.7,5,1
16.63,15.46,0.8747,6.053,3.465,2.04,5.877,1
16.44,15.25,0.888,5.884,3.505,1.969,5.533,1
: :
xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
@RELATION iris
@ATTRIBUTE sepallength numeric
@ATTRIBUTE sepalwidth numeric
@ATTRIBUTE petallength numeric
@ATTRIBUTE petalwidth numeric
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
@DATA
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
: :
So this appears to be what is causing the problem for the mahout arff.vector command.
It does not seem to like double and float input data.
Is there any solution to this?
I am using Cloudera CDH5 Version 5.6.0-1.cdh5.6.0.p0.45
and Mahout Version 0.9+cdh5.6.0+26
Any help would be most welcome!
Created 11-22-2016 02:51 PM
Hello Everybody !
I found the answer to this problem not too long after my last post.
Since the mahout arff.vector command seems to accept only integer input (at least in my setup), transform your double and float data into integer values (whole numbers).
This is done by multiplying the numeric data by 10 raised to the necessary power, leaving the class column unchanged.
E.g. if you have data like 22.23, multiply by 100; if you have data like 13.854, multiply by 1000.
NB: Always multiply the whole dataset by the same number.
Multiplying by a single factor just rescales the data without changing its characteristics.
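The rescaling can also be scripted. Below is a minimal Python sketch, not the procedure actually used to produce the files in this post. The names common_factor and rescale_arff are made up for this example. The sketch assumes a dense, comma-separated @data section with no missing values ("?"), and it assumes the last column is the class label. It picks the power of ten from the value with the most decimal places, so no rounding is needed.

```python
from decimal import Decimal


def common_factor(rows, numeric_cols):
    """Smallest power of ten that turns every listed value into a whole number."""
    max_places = 0
    for row in rows:
        for i in numeric_cols:
            # The exponent of Decimal("13.854") is -3, i.e. three decimal places.
            places = -Decimal(row[i]).as_tuple().exponent
            max_places = max(max_places, places)
    return 10 ** max_places


def rescale_arff(lines):
    """Return ARFF lines with all numeric columns multiplied by one common factor.

    Assumption: the last column is the class label and is left unchanged.
    Blank lines and % comments inside the data section are dropped.
    """
    header, rows, in_data = [], [], False
    for line in lines:
        stripped = line.strip()
        if not in_data:
            header.append(stripped)
            in_data = stripped.lower() == "@data"
        elif stripped and not stripped.startswith("%"):
            rows.append([field.strip() for field in stripped.split(",")])
    if not rows:
        return header
    numeric_cols = range(len(rows[0]) - 1)
    factor = common_factor(rows, numeric_cols)
    scaled_rows = []
    for row in rows:
        scaled = [str(int(Decimal(row[i]) * factor)) for i in numeric_cols]
        scaled_rows.append(",".join(scaled + [row[-1]]))
    return header + scaled_rows


sample = [
    "@RELATION iris",
    "@ATTRIBUTE sepallength numeric",
    "@ATTRIBUTE sepalwidth numeric",
    "@ATTRIBUTE petallength numeric",
    "@ATTRIBUTE petalwidth numeric",
    "@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}",
    "@DATA",
    "5.1,3.5,1.4,0.2,Iris-setosa",
    "4.9,3.0,1.4,0.2,Iris-setosa",
]
for out_line in rescale_arff(sample):
    print(out_line)
```

For the two sample rows the factor is 10, so the data lines come out as 51,35,14,2,Iris-setosa and 49,30,14,2,Iris-setosa. Decimal is used instead of float so that a value such as 0.871 is multiplied exactly, with no floating-point error.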
Here is my rescaled iris data
@RELATION iris
@ATTRIBUTE sepallength numeric
@ATTRIBUTE sepalwidth numeric
@ATTRIBUTE petallength numeric
@ATTRIBUTE petalwidth numeric
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
@DATA
35,14,2,51,Iris-setosa
30,14,2,49,Iris-setosa
32,13,2,47,Iris-setosa
31,15,2,46,Iris-setosa
36,14,2,50,Iris-setosa
39,17,4,54,Iris-setosa
: :
And here my rescaled seeds data
@RELATION seeds
@ATTRIBUTE area numeric
@ATTRIBUTE perimeter numeric
@ATTRIBUTE compactness numeric
@ATTRIBUTE kernel_length numeric
@ATTRIBUTE kernel_width numeric
@ATTRIBUTE asymmetry_coefficient numeric
@ATTRIBUTE kernel_groove numeric
@ATTRIBUTE class {1,2,3}
@DATA
14840,15260,5763,3312,2221,5220,871,1
14570,14880,5554,3333,1018,4956,881,1
14090,14290,5291,3337,2699,4825,905,1
13940,13840,5324,3379,2259,4805,896,1
14990,16140,5658,3562,1355,5175,903,1
14210,14380,5386,3312,2462,4956,895,1
: :
After this, mahout arff.vector works fine, and mahout seqdumper shows the required output:
For the iris data
Input Path: hdfs://childnode1:8020/user/Masternode/iris_data/kmeansout/clusteredPoints/part-m-00000 Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.clustering.classify.WeightedPropertyVectorWritable Key: 4: Value: wt: 1.0 distance: 1.4694216549377501 vec: [35.000, 14.000, 2.000, 51.000, 1.000] Key: 4: Value: wt: 1.0 distance: 4.381689171997383 vec: [30.000, 14.000, 2.000, 49.000, 1.000] Key: 4: Value: wt: 1.0 distance: 4.123008610226189 vec: [32.000, 13.000, 2.000, 47.000, 1.000] Key: 4: Value: wt: 1.0 distance: 5.188371613521854 vec: [31.000, 15.000, 2.000, 46.000, 1.000] Key: 4: Value: wt: 1.0 distance: 1.9796969465046923 vec: [36.000, 14.000, 2.000, 50.000, 1.000] Key: 4: Value: wt: 1.0 distance: 6.838069903123147 vec: [39.000, 17.000, 4.000, 54.000, 1.000] Key: 4: Value: wt: 1.0 distance: 4.152011560677545 vec: [34.000, 14.000, 3.000, 46.000, 1.000] Key: 4: Value: wt: 1.0 distance: 0.5993329625508677 vec: [34.000, 15.000, 2.000, 50.000, 1.000] Key: 4: Value: wt: 1.0 distance: 8.009943820027594 vec: [29.000, 14.000, 2.000, 44.000, 1.000] Key: 4: Value: wt: 1.0 distance: 3.6659514453958333 vec: [31.000, 15.000, 1.000, 49.000, 1.000] Key: 4: Value: wt: 1.0 distance: 4.878442374364758 vec: [37.000, 15.000, 2.000, 54.000, 1.000] Key: 4: Value: wt: 1.0 distance: 2.513801901502982 vec: [34.000, 16.000, 2.000, 48.000, 1.000] Key: 4: Value: wt: 1.0 distance: 4.919268238264621 vec: [30.000, 14.000, 1.000, 48.000, 1.000] Key: 4: Value: wt: 1.0 distance: 9.090610540552246 vec: [30.000, 11.000, 1.000, 43.000, 1.000] Key: 4: Value: wt: 1.0 distance: 10.201921387660274 vec: [40.000, 12.000, 2.000, 58.000, 1.000] Key: 4: Value: wt: 1.0 distance: 12.130919173747719 vec: [44.000, 15.000, 4.000, 57.000, 1.000] Key: 4: Value: wt: 1.0 distance: 6.624137679728539 vec: [39.000, 13.000, 4.000, 54.000, 1.000] Key: 4: Value: wt: 1.0 distance: 1.5097019573412485 vec: [35.000, 14.000, 3.000, 51.000, 1.000] Key: 4: Value: wt: 1.0 distance: 
8.284877790287531 vec: [38.000, 17.000, 3.000, 57.000, 1.000] Key: 4: Value: wt: 1.0 distance: 3.989887216451146 vec: [38.000, 15.000, 3.000, 51.000, 1.000] Key: 4: Value: wt: 1.0 distance: 4.6172719218168305 vec: [34.000, 17.000, 2.000, 54.000, 1.000] Key: 4: Value: wt: 1.0 distance: 3.3762701313727606 vec: [37.000, 15.000, 4.000, 51.000, 1.000] Key: 4: Value: wt: 1.0 distance: 6.443539400050163 vec: [36.000, 10.000, 2.000, 46.000, 1.000] Key: 4: Value: wt: 1.0 distance: 3.7946277814826366 vec: [33.000, 17.000, 5.000, 51.000, 1.000] Key: 4: Value: wt: 1.0 distance: 4.845534026296767 vec: [34.000, 19.000, 2.000, 48.000, 1.000] Key: 4: Value: wt: 1.0 distance: 4.4180538702010885 vec: [30.000, 16.000, 2.000, 50.000, 1.000] Key: 4: Value: wt: 1.0 distance: 2.078268510082371 vec: [34.000, 16.000, 4.000, 50.000, 1.000] Key: 4: Value: wt: 1.0 distance: 2.181559075523739 vec: [35.000, 15.000, 2.000, 52.000, 1.000] Key: 4: Value: wt: 1.0 distance: 2.097426995153822 vec: [34.000, 14.000, 2.000, 52.000, 1.000] Key: 4: Value: wt: 1.0 distance: 4.019850743497717 vec: [32.000, 16.000, 2.000, 47.000, 1.000] Key: 4: Value: wt: 1.0 distance: 4.049592572099055 vec: [31.000, 16.000, 2.000, 48.000, 1.000] Key: 4: Value: wt: 1.0 distance: 4.25666536152411 vec: [34.000, 15.000, 4.000, 54.000, 1.000] Key: 4: Value: wt: 1.0 distance: 7.244252894536452 vec: [41.000, 15.000, 1.000, 52.000, 1.000] Key: 4: Value: wt: 1.0 distance: 9.28219801555635 vec: [42.000, 14.000, 2.000, 55.000, 1.000] Key: 4: Value: wt: 1.0 distance: 3.6659514453958333 vec: [31.000, 15.000, 1.000, 49.000, 1.000] Key: 4: Value: wt: 1.0 distance: 3.4524194414930762 vec: [32.000, 12.000, 2.000, 50.000, 1.000] Key: 4: Value: wt: 1.0 distance: 5.287645979072288 vec: [35.000, 13.000, 2.000, 55.000, 1.000] Key: 4: Value: wt: 1.0 distance: 3.6659514453958333 vec: [31.000, 15.000, 1.000, 49.000, 1.000] Key: 4: Value: wt: 1.0 distance: 7.555077762670502 vec: [30.000, 13.000, 2.000, 44.000, 1.000] Key: 4: Value: wt: 1.0 distance: 
1.1131936040060575 vec: [34.000, 15.000, 2.000, 51.000, 1.000] Key: 4: Value: wt: 1.0 distance: 1.9181240835774944 vec: [35.000, 13.000, 3.000, 50.000, 1.000] Key: 4: Value: wt: 1.0 distance: 12.39351443296044 vec: [23.000, 13.000, 3.000, 45.000, 1.000] Key: 4: Value: wt: 1.0 distance: 6.6602702647864795 vec: [32.000, 13.000, 2.000, 44.000, 1.000] Key: 4: Value: wt: 1.0 distance: 3.898615138738256 vec: [35.000, 16.000, 6.000, 50.000, 1.000] Key: 4: Value: wt: 1.0 distance: 6.076117181226863 vec: [38.000, 19.000, 4.000, 51.000, 1.000] Key: 4: Value: wt: 1.0 distance: 4.737003272111904 vec: [30.000, 14.000, 3.000, 48.000, 1.000] Key: 4: Value: wt: 1.0 distance: 4.185594342503789 vec: [38.000, 16.000, 2.000, 51.000, 1.000] Key: 4: Value: wt: 1.0 distance: 4.673242985336684 vec: [32.000, 14.000, 2.000, 46.000, 1.000] Key: 4: Value: wt: 1.0 distance: 4.113295515763298 vec: [37.000, 15.000, 2.000, 53.000, 1.000] Key: 4: Value: wt: 1.0 distance: 1.413930691370691 vec: [33.000, 14.000, 2.000, 50.000, 1.000] Key: 128: Value: wt: 1.0 distance: 12.27183013149531 vec: [32.000, 47.000, 14.000, 70.000, 2.000] Key: 128: Value: wt: 1.0 distance: 6.845135447338132 vec: [32.000, 45.000, 15.000, 64.000, 2.000] Key: 100: Value: wt: 1.0 distance: 10.234304921679705 vec: [31.000, 49.000, 15.000, 69.000, 2.000] Key: 128: Value: wt: 1.0 distance: 7.318849411742204 vec: [23.000, 40.000, 13.000, 55.000, 2.000] Key: 128: Value: wt: 1.0 distance: 6.3893365248584155 vec: [28.000, 46.000, 15.000, 65.000, 2.000] Key: 128: Value: wt: 1.0 distance: 2.7032373546600645 vec: [28.000, 45.000, 13.000, 57.000, 2.000] Key: 128: Value: wt: 1.0 distance: 7.648597295107581 vec: [33.000, 47.000, 16.000, 63.000, 2.000] Key: 128: Value: wt: 1.0 distance: 15.840467020307052 vec: [24.000, 33.000, 10.000, 49.000, 2.000] : :
: :
Count: 150
And for the seeds data
Input Path: hdfs://childnode1:8020/user/Masternode/seeds/kmeans_out/clusteredPoints/part-m-00000 Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.clustering.classify.WeightedPropertyVectorWritable Key: 24: Value: wt: 1.0 distance: 861.9071668766143 vec: [14840.000, 15260.000, 5763.000, 3312.000, 2221.000, 5220.000, 871.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1668.6900610565315 vec: [14570.000, 14880.000, 5554.000, 3333.000, 1018.000, 4956.000, 881.000, 1.000] Key: 24: Value: wt: 1.0 distance: 694.0022557455687 vec: [14090.000, 14290.000, 5291.000, 3337.000, 2699.000, 4825.000, 905.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1137.750249826391 vec: [13940.000, 13840.000, 5324.000, 3379.000, 2259.000, 4805.000, 896.000, 1.000] Key: 24: Value: wt: 1.0 distance: 2066.340415883421 vec: [14990.000, 16140.000, 5658.000, 3562.000, 1355.000, 5175.000, 903.000, 1.000] Key: 24: Value: wt: 1.0 distance: 508.48008032866846 vec: [14210.000, 14380.000, 5386.000, 3312.000, 2462.000, 4956.000, 895.000, 1.000] Key: 24: Value: wt: 1.0 distance: 939.0255545463335 vec: [14490.000, 14690.000, 5563.000, 3259.000, 3586.000, 5219.000, 880.000, 1.000] Key: 24: Value: wt: 1.0 distance: 693.4040772257339 vec: [14100.000, 14110.000, 5420.000, 3302.000, 2700.000, 5000.000, 891.000, 1.000] Key: 24: Value: wt: 1.0 distance: 2457.5552311826623 vec: [15460.000, 16630.000, 6053.000, 3465.000, 2040.000, 5877.000, 875.000, 1.000] Key: 24: Value: wt: 1.0 distance: 2136.715539711948 vec: [15250.000, 16440.000, 5884.000, 3505.000, 1969.000, 5533.000, 888.000, 1.000] Key: 24: Value: wt: 1.0 distance: 2037.64908599264 vec: [14850.000, 15260.000, 5714.000, 3242.000, 4543.000, 5314.000, 870.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1183.0427661481044 vec: [14160.000, 14030.000, 5438.000, 3201.000, 1717.000, 5001.000, 880.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1668.9093470760495 vec: [14020.000, 13890.000, 5439.000, 3199.000, 3986.000, 4738.000, 888.000, 
1.000] Key: 24: Value: wt: 1.0 distance: 1129.8172629441638 vec: [14060.000, 13780.000, 5479.000, 3156.000, 3136.000, 4872.000, 876.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1114.6345778285634 vec: [14050.000, 13740.000, 5482.000, 3114.000, 2932.000, 4825.000, 874.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1616.574316222106 vec: [14280.000, 14590.000, 5351.000, 3333.000, 4185.000, 4781.000, 899.000, 1.000] Key: 177: Value: wt: 1.0 distance: 2237.7216511574297 vec: [13830.000, 13990.000, 5119.000, 3383.000, 5234.000, 4781.000, 918.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1533.017981297028 vec: [14750.000, 15690.000, 5527.000, 3514.000, 1599.000, 5046.000, 906.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1141.7747846041293 vec: [14210.000, 14700.000, 5205.000, 3466.000, 1767.000, 4649.000, 915.000, 1.000] Key: 177: Value: wt: 1.0 distance: 1073.5112523108057 vec: [13570.000, 12720.000, 5226.000, 3049.000, 4102.000, 4914.000, 869.000, 1.000] Key: 24: Value: wt: 1.0 distance: 673.1032757822736 vec: [14400.000, 14160.000, 5658.000, 3129.000, 3072.000, 5176.000, 858.000, 1.000] Key: 24: Value: wt: 1.0 distance: 588.5794743674002 vec: [14260.000, 14110.000, 5520.000, 3168.000, 2688.000, 5219.000, 872.000, 1.000] Key: 24: Value: wt: 1.0 distance: 2307.6221450474227 vec: [14900.000, 15880.000, 5618.000, 3507.000, 765.000, 5091.000, 899.000, 1.000] Key: 24: Value: wt: 1.0 distance: 3165.469289919653 vec: [13230.000, 12080.000, 5099.000, 2936.000, 1415.000, 4961.000, 866.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1022.3111169642691 vec: [14760.000, 15010.000, 5789.000, 3245.000, 1791.000, 5001.000, 866.000, 1.000] Key: 24: Value: wt: 1.0 distance: 2453.594725178613 vec: [15160.000, 16190.000, 5833.000, 3421.000, 903.000, 5307.000, 885.000, 1.000] Key: 177: Value: wt: 1.0 distance: 1842.0639979312034 vec: [13760.000, 13020.000, 5395.000, 3026.000, 3373.000, 4825.000, 864.000, 1.000] Key: 24: Value: wt: 1.0 distance: 2127.2694725299393 vec: [13670.000, 
12740.000, 5395.000, 2956.000, 2504.000, 4869.000, 856.000, 1.000] Key: 24: Value: wt: 1.0 distance: 638.121303238346 vec: [14180.000, 14110.000, 5541.000, 3221.000, 2754.000, 5038.000, 882.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1570.180781482017 vec: [14020.000, 13450.000, 5516.000, 3065.000, 3531.000, 5097.000, 860.000, 1.000] Key: 24: Value: wt: 1.0 distance: 2442.635062314189 vec: [13820.000, 13160.000, 5454.000, 2975.000, 855.000, 5056.000, 866.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1252.115173918792 vec: [14940.000, 15490.000, 5757.000, 3371.000, 3412.000, 5228.000, 872.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1405.0316200008115 vec: [14410.000, 14090.000, 5717.000, 3186.000, 3920.000, 5299.000, 853.000, 1.000] Key: 24: Value: wt: 1.0 distance: 954.5724423251518 vec: [14170.000, 13940.000, 5585.000, 3150.000, 2124.000, 5012.000, 873.000, 1.000] Key: 24: Value: wt: 1.0 distance: 729.6383182264915 vec: [14680.000, 15050.000, 5712.000, 3328.000, 2129.000, 5360.000, 878.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1655.6627215851186 vec: [15000.000, 16120.000, 5709.000, 3485.000, 2270.000, 5443.000, 1000.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1818.907545412335 vec: [15270.000, 16200.000, 5826.000, 3464.000, 2823.000, 5527.000, 873.000, 1.000] Key: 76: Value: wt: 1.0 distance: 2107.009330735528 vec: [15380.000, 17080.000, 5832.000, 3683.000, 2956.000, 5484.000, 908.000, 1.000] Key: 24: Value: wt: 1.0 distance: 512.998557374996 vec: [14520.000, 14800.000, 5656.000, 3288.000, 3112.000, 5309.000, 882.000, 1.000] Key: 177: Value: wt: 1.0 distance: 3176.186184965435 vec: [14170.000, 14280.000, 5397.000, 3298.000, 6685.000, 5001.000, 894.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1291.054421730103 vec: [13850.000, 13540.000, 5348.000, 3156.000, 2587.000, 5178.000, 887.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1382.5623224377814 vec: [13850.000, 13500.000, 5351.000, 3158.000, 2249.000, 5176.000, 885.000, 1.000] Key: 24: Value: wt: 1.0 
distance: 1853.3448494492047 vec: [13550.000, 13160.000, 5138.000, 3201.000, 2461.000, 4783.000, 901.000, 1.000] Key: 24: Value: wt: 1.0 distance: 2315.520560114591 vec: [14860.000, 15500.000, 5877.000, 3396.000, 4711.000, 5528.000, 882.000, 1.000] Key: 24: Value: wt: 1.0 distance: 695.316069521979 vec: [14540.000, 15110.000, 5579.000, 3462.000, 3128.000, 5180.000, 899.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1478.614814571006 vec: [14040.000, 13800.000, 5376.000, 3155.000, 1560.000, 4961.000, 879.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1508.3478794444047 vec: [14760.000, 15360.000, 5701.000, 3393.000, 1367.000, 5132.000, 886.000, 1.000] Key: 24: Value: wt: 1.0 distance: 481.97134871269895 vec: [14560.000, 14990.000, 5570.000, 3377.000, 2958.000, 5175.000, 888.000, 1.000] Key: 24: Value: wt: 1.0 distance: 183.71730361238392 vec: [14520.000, 14790.000, 5545.000, 3291.000, 2704.000, 5111.000, 882.000, 1.000] Key: 24: Value: wt: 1.0 distance: 630.6956414080009 vec: [14670.000, 14860.000, 5678.000, 3258.000, 2129.000, 5351.000, 868.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1346.3622667032168 vec: [14400.000, 14430.000, 5585.000, 3272.000, 3975.000, 5144.000, 875.000, 1.000] Key: 24: Value: wt: 1.0 distance: 3192.1534037702677 vec: [14910.000, 15780.000, 5674.000, 3434.000, 5593.000, 5136.000, 892.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1513.9463214768837 vec: [14610.000, 14490.000, 5715.000, 3113.000, 4116.000, 5396.000, 854.000, 1.000] Key: 24: Value: wt: 1.0 distance: 778.4078250448471 vec: [14280.000, 14330.000, 5504.000, 3199.000, 3328.000, 5224.000, 883.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1243.4189505293368 vec: [14600.000, 14520.000, 5741.000, 3113.000, 1481.000, 5487.000, 856.000, 1.000] Key: 24: Value: wt: 1.0 distance: 915.6820426338833 vec: [14770.000, 15030.000, 5702.000, 3212.000, 1933.000, 5439.000, 866.000, 1.000] Key: 24: Value: wt: 1.0 distance: 365.8725763036213 vec: [14350.000, 14460.000, 5388.000, 3377.000, 2802.000, 
5044.000, 882.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1551.4815116461734 vec: [14430.000, 14920.000, 5384.000, 3412.000, 1142.000, 5088.000, 901.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1041.0816986203454 vec: [14770.000, 15380.000, 5662.000, 3419.000, 1999.000, 5222.000, 886.000, 1.000] Key: 24: Value: wt: 1.0 distance: 3069.1391786047893 vec: [13470.000, 12110.000, 5159.000, 3032.000, 1502.000, 4519.000, 839.000, 1.000] Key: 177: Value: wt: 1.0 distance: 2234.4099731961946 vec: [12860.000, 11420.000, 5008.000, 2850.000, 2700.000, 4607.000, 868.000, 1.000] Key: 177: Value: wt: 1.0 distance: 2723.181787028119 vec: [12630.000, 11230.000, 4902.000, 2879.000, 2269.000, 4703.000, 884.000, 1.000] Key: 177: Value: wt: 1.0 distance: 1679.8632384057453 vec: [13190.000, 12360.000, 5076.000, 3042.000, 3220.000, 4605.000, 892.000, 1.000] Key: 177: Value: wt: 1.0 distance: 1525.05309032792 vec: [13840.000, 13220.000, 5395.000, 3070.000, 4157.000, 5088.000, 868.000, 1.000] Key: 24: Value: wt: 1.0 distance: 2603.1739906169455 vec: [13570.000, 12780.000, 5262.000, 3026.000, 1176.000, 4782.000, 872.000, 1.000] Key: 24: Value: wt: 1.0 distance: 2164.8100360288104 vec: [13500.000, 12880.000, 5139.000, 3119.000, 2352.000, 4607.000, 888.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1379.1305505369796 vec: [14370.000, 14340.000, 5630.000, 3190.000, 1313.000, 5150.000, 873.000, 1.000] Key: 24: Value: wt: 1.0 distance: 802.2824508737174 vec: [14290.000, 14010.000, 5609.000, 3158.000, 2217.000, 5132.000, 862.000, 1.000] Key: 24: Value: wt: 1.0 distance: 1230.3844556714478 vec: [14390.000, 14370.000, 5569.000, 3153.000, 1464.000, 5300.000, 873.000, 1.000] Key: 177: Value: wt: 1.0 distance: 1533.2295745642984 vec: [13750.000, 12730.000, 5412.000, 2882.000, 3533.000, 5067.000, 846.000, 1.000] Key: 76: Value: wt: 1.0 distance: 1242.0782987132768 vec: [15980.000, 17630.000, 6191.000, 3561.000, 4076.000, 6060.000, 867.000, 2.000] Key: 76: Value: wt: 1.0 distance: 2284.8314552331644 
vec: [15670.000, 16840.000, 5998.000, 3484.000, 4675.000, 5877.000, 862.000, 2.000] Key: 76: Value: wt: 1.0 distance: 1865.3220945799494 vec: [15730.000, 17260.000, 5978.000, 3594.000, 4539.000, 5791.000, 876.000, 2.000] Key: 76: Value: wt: 1.0 distance: 802.7846465017649 vec: [16260.000, 19110.000, 6154.000, 3930.000, 2936.000, 6079.000, 908.000, 2.000] Key: 76: Value: wt: 1.0 distance: 2130.8931341845782 vec: [15510.000, 16820.000, 6017.000, 3486.000, 4004.000, 5841.000, 879.000, 2.000] Key: 76: Value: wt: 1.0 distance: 2497.1535713990975 vec: [15620.000, 16770.000, 5927.000, 3438.000, 4920.000, 5795.000, 864.000, 2.000] Key: 76: Value: wt: 1.0 distance: 1519.304200689462 vec: [15910.000, 17320.000, 6064.000, 3403.000, 3824.000, 5922.000, 860.000, 2.000] Key: 76: Value: wt: 1.0 distance: 2415.437198684845 vec: [17230.000, 20710.000, 6579.000, 3814.000, 4451.000, 6451.000, 876.000, 2.000] Key: 76: Value: wt: 1.0 distance: 1538.7977024244756 vec: [16490.000, 18940.000, 6445.000, 3639.000, 5064.000, 6362.000, 875.000, 2.000] Key: 76: Value: wt: 1.0 distance: 1983.9629595142953 vec: [15550.000, 17120.000, 5850.000, 3566.000, 2858.000, 5746.000, 889.000, 2.000] : :
: :
Count: 210
And for the mahout clusterdump output, ie the clusters:
For the iris data
VL-128{n=62 c=[27.484, 43.935, 14.339, 59.016, 2.226] r=[2.939, 5.048, 2.951, 4.626, 0.418]}
Weight : [props - optional]: Point:
1.0 : [distance=12.27183013149531]: [32.000, 47.000, 14.000, 70.000, 2.000]
1.0 : [distance=6.845135447338132]: [32.000, 45.000, 15.000, 64.000, 2.000]
1.0 : [distance=7.318849411742204]: [23.000, 40.000, 13.000, 55.000, 2.000]
: :
: :
VL-4{n=50 c=[34.180, 14.640, 2.440, 50.060, 1.000] r=[0:3.772, 1:1.718, 2:1.061, 3:3.489]}
Weight : [props - optional]: Point:
1.0 : [distance=1.4694216549377501]: [35.000, 14.000, 2.000, 51.000, 1.000]
1.0 : [distance=4.381689171997383]: [30.000, 14.000, 2.000, 49.000, 1.000]
1.0 : [distance=4.123008610226189]: [32.000, 13.000, 2.000, 47.000, 1.000]
: :
: :
VL-100{n=38 c=[30.737, 57.421, 20.711, 68.500, 2.947] r=[2.862, 4.821, 2.762, 4.876, 0.223]}
Weight : [props - optional]: Point:
1.0 : [distance=10.234304921679705]: [31.000, 49.000, 15.000, 69.000, 2.000]
1.0 : [distance=8.516482308683988]: [30.000, 50.000, 17.000, 67.000, 2.000]
1.0 : [distance=7.773365278708711]: [33.000, 60.000, 25.000, 63.000, 3.000]
: :
: :
And for the seeds data
VL-76{n=61 c=[16297.377, 18721.803, 6208.934, 3722.672, 3603.590, 6066.098, 885.115, 1.984] r=[470.329, 1087.056, 218.340, 150.079, 1222.928, 222.042, 14.901, 0.127]}
Weight : [props - optional]: Point:
1.0 : [distance=2107.009330735528]: [15380.000, 17080.000, 5832.000, 3683.000, 2956.000, 5484.000, 908.000, 1.000]
1.0 : [distance=1242.0782987132768]: [15980.000, 17630.000, 6191.000, 3561.000, 4076.000, 6060.000, 867.000, 2.000]
1.0 : [distance=2284.8314552331644]: [15670.000, 16840.000, 5998.000, 3484.000, 4675.000, 5877.000, 862.000, 2.000]
: :
: :
VL-177{n=77 c=[13274.805, 11964.416, 5229.286, 2872.922, 4759.740, 5088.519, 852.208, 2.766] r=[370.481, 808.956, 141.698, 162.027, 1292.441, 182.253, 22.964, 0.643]}
Weight : [props - optional]: Point:
1.0 : [distance=2237.7216511574297]: [13830.000, 13990.000, 5119.000, 3383.000, 5234.000, 4781.000, 918.000, 1.000]
1.0 : [distance=1073.5112523108057]: [13570.000, 12720.000, 5226.000, 3049.000, 4102.000, 4914.000, 869.000, 1.000]
1.0 : [distance=1842.0639979312034]: [13760.000, 13020.000, 5395.000, 3026.000, 3373.000, 4825.000, 864.000, 1.000]
: :
: :
VL-24{n=72 c=[14460.417, 14648.472, 5563.778, 3277.903, 2648.931, 5192.319, 880.556, 1.194] r=[531.885, 1108.561, 218.877, 158.321, 1093.229, 318.248, 21.182, 0.461]}
Weight : [props - optional]: Point:
1.0 : [distance=861.9071668766143]: [14840.000, 15260.000, 5763.000, 3312.000, 2221.000, 5220.000, 871.000, 1.000]
1.0 : [distance=1668.6900610565315]: [14570.000, 14880.000, 5554.000, 3333.000, 1018.000, 4956.000, 881.000, 1.000]
1.0 : [distance=694.0022557455687]: [14090.000, 14290.000, 5291.000, 3337.000, 2699.000, 4825.000, 905.000, 1.000]
: :
: :
So the mahout arff.vector command works fine. Always !!!!
Sorry for the delay in replying !
Created 11-22-2016 02:51 PM
Hello everybody!
I found the answer to this problem not long after my last post.
Since the mahout arff.vector command seems to handle only integer input properly, transform your double and float data into integer values (whole numbers).
This is done by multiplying each column of data by 10 raised to the necessary power.
E.g. if you have data like 22.23, multiply by 100; if you have data like 13.854, multiply by 1000.
NB: Always multiply the whole dataset by the same number.
Multiplying by a single factor just rescales the data without changing its characteristics.
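The rescaling step can be sketched in a few lines of Python. This is only a minimal illustration, not part of Mahout: the function name rescale_arff and the sample data are made up for the example, and it assumes a simple dense ARFF file with no quoted values containing commas. Nominal columns such as the class label are left untouched, and the script refuses a factor that would still leave a fractional part.

```python
from decimal import Decimal


def rescale_arff(text, factor):
    """Multiply every numeric ARFF attribute by `factor` and emit whole numbers.

    Hypothetical helper: handles plain dense ARFF only.
    """
    numeric = []      # one flag per attribute: True if declared numeric
    in_data = False
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        upper = stripped.upper()
        if not in_data:
            if upper.startswith("@ATTRIBUTE"):
                # "@ATTRIBUTE name type": the type is the last token
                numeric.append(upper.split()[-1] in ("NUMERIC", "REAL", "INTEGER"))
            elif upper.startswith("@DATA"):
                in_data = True
            out.append(line)
            continue
        if not stripped or stripped.startswith("%"):
            out.append(line)  # keep blank lines and comments as they are
            continue
        scaled = []
        for is_num, field in zip(numeric, (f.strip() for f in stripped.split(","))):
            if is_num and field != "?":
                # Decimal avoids binary floating point surprises such as 22.23 * 100
                value = Decimal(field) * factor
                if value != value.to_integral_value():
                    raise ValueError(f"{field} x {factor} is not a whole number")
                scaled.append(str(int(value)))
            else:
                scaled.append(field)
        out.append(",".join(scaled))
    return "\n".join(out)


sample = """@RELATION demo
@ATTRIBUTE length numeric
@ATTRIBUTE width numeric
@ATTRIBUTE class {a,b}
@DATA
22.23,13.8,a
7.05,0.4,b"""

print(rescale_arff(sample, 100))
# data rows become: 2223,1380,a and 705,40,b
```

Using one factor for the whole file keeps the relative distances between points the same, which is what kmeans and canopy depend on.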
Here is my rescaled iris data
@RELATION iris
@ATTRIBUTE sepallength numeric
@ATTRIBUTE sepalwidth numeric
@ATTRIBUTE petallength numeric
@ATTRIBUTE petalwidth numeric
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
@DATA
35,14,2,51,Iris-setosa
30,14,2,49,Iris-setosa
32,13,2,47,Iris-setosa
31,15,2,46,Iris-setosa
36,14,2,50,Iris-setosa
39,17,4,54,Iris-setosa
: :
And here my rescaled seeds data
@RELATION seeds
@ATTRIBUTE area numeric
@ATTRIBUTE perimeter numeric
@ATTRIBUTE compactness numeric
@ATTRIBUTE kernel_length numeric
@ATTRIBUTE kernel_width numeric
@ATTRIBUTE asymmetry_coefficient numeric
@ATTRIBUTE kernel_groove numeric
@ATTRIBUTE class {1,2,3}
@DATA
14840,15260,5763,3312,2221,5220,871,1
14570,14880,5554,3333,1018,4956,881,1
14090,14290,5291,3337,2699,4825,905,1
13940,13840,5324,3379,2259,4805,896,1
14990,16140,5658,3562,1355,5175,903,1
14210,14380,5386,3312,2462,4956,895,1
: :
After this, mahout arff.vector works fine, and kmeans produces the expected mahout seqdumper output (the clustered points):
For the iris data
Input Path: hdfs://childnode1:8020/user/Masternode/iris_data/kmeansout/clusteredPoints/part-m-00000
Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.clustering.classify.WeightedPropertyVectorWritable
Key: 4: Value: wt: 1.0 distance: 1.4694216549377501 vec: [35.000, 14.000, 2.000, 51.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.381689171997383 vec: [30.000, 14.000, 2.000, 49.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.123008610226189 vec: [32.000, 13.000, 2.000, 47.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 5.188371613521854 vec: [31.000, 15.000, 2.000, 46.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 1.9796969465046923 vec: [36.000, 14.000, 2.000, 50.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 6.838069903123147 vec: [39.000, 17.000, 4.000, 54.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.152011560677545 vec: [34.000, 14.000, 3.000, 46.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 0.5993329625508677 vec: [34.000, 15.000, 2.000, 50.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 8.009943820027594 vec: [29.000, 14.000, 2.000, 44.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 3.6659514453958333 vec: [31.000, 15.000, 1.000, 49.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.878442374364758 vec: [37.000, 15.000, 2.000, 54.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 2.513801901502982 vec: [34.000, 16.000, 2.000, 48.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.919268238264621 vec: [30.000, 14.000, 1.000, 48.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 9.090610540552246 vec: [30.000, 11.000, 1.000, 43.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 10.201921387660274 vec: [40.000, 12.000, 2.000, 58.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 12.130919173747719 vec: [44.000, 15.000, 4.000, 57.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 6.624137679728539 vec: [39.000, 13.000, 4.000, 54.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 1.5097019573412485 vec: [35.000, 14.000, 3.000, 51.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 8.284877790287531 vec: [38.000, 17.000, 3.000, 57.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 3.989887216451146 vec: [38.000, 15.000, 3.000, 51.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.6172719218168305 vec: [34.000, 17.000, 2.000, 54.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 3.3762701313727606 vec: [37.000, 15.000, 4.000, 51.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 6.443539400050163 vec: [36.000, 10.000, 2.000, 46.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 3.7946277814826366 vec: [33.000, 17.000, 5.000, 51.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.845534026296767 vec: [34.000, 19.000, 2.000, 48.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.4180538702010885 vec: [30.000, 16.000, 2.000, 50.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 2.078268510082371 vec: [34.000, 16.000, 4.000, 50.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 2.181559075523739 vec: [35.000, 15.000, 2.000, 52.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 2.097426995153822 vec: [34.000, 14.000, 2.000, 52.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.019850743497717 vec: [32.000, 16.000, 2.000, 47.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.049592572099055 vec: [31.000, 16.000, 2.000, 48.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.25666536152411 vec: [34.000, 15.000, 4.000, 54.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 7.244252894536452 vec: [41.000, 15.000, 1.000, 52.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 9.28219801555635 vec: [42.000, 14.000, 2.000, 55.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 3.6659514453958333 vec: [31.000, 15.000, 1.000, 49.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 3.4524194414930762 vec: [32.000, 12.000, 2.000, 50.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 5.287645979072288 vec: [35.000, 13.000, 2.000, 55.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 3.6659514453958333 vec: [31.000, 15.000, 1.000, 49.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 7.555077762670502 vec: [30.000, 13.000, 2.000, 44.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 1.1131936040060575 vec: [34.000, 15.000, 2.000, 51.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 1.9181240835774944 vec: [35.000, 13.000, 3.000, 50.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 12.39351443296044 vec: [23.000, 13.000, 3.000, 45.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 6.6602702647864795 vec: [32.000, 13.000, 2.000, 44.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 3.898615138738256 vec: [35.000, 16.000, 6.000, 50.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 6.076117181226863 vec: [38.000, 19.000, 4.000, 51.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.737003272111904 vec: [30.000, 14.000, 3.000, 48.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.185594342503789 vec: [38.000, 16.000, 2.000, 51.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.673242985336684 vec: [32.000, 14.000, 2.000, 46.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.113295515763298 vec: [37.000, 15.000, 2.000, 53.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 1.413930691370691 vec: [33.000, 14.000, 2.000, 50.000, 1.000]
Key: 128: Value: wt: 1.0 distance: 12.27183013149531 vec: [32.000, 47.000, 14.000, 70.000, 2.000]
Key: 128: Value: wt: 1.0 distance: 6.845135447338132 vec: [32.000, 45.000, 15.000, 64.000, 2.000]
Key: 100: Value: wt: 1.0 distance: 10.234304921679705 vec: [31.000, 49.000, 15.000, 69.000, 2.000]
Key: 128: Value: wt: 1.0 distance: 7.318849411742204 vec: [23.000, 40.000, 13.000, 55.000, 2.000]
Key: 128: Value: wt: 1.0 distance: 6.3893365248584155 vec: [28.000, 46.000, 15.000, 65.000, 2.000]
Key: 128: Value: wt: 1.0 distance: 2.7032373546600645 vec: [28.000, 45.000, 13.000, 57.000, 2.000]
Key: 128: Value: wt: 1.0 distance: 7.648597295107581 vec: [33.000, 47.000, 16.000, 63.000, 2.000]
Key: 128: Value: wt: 1.0 distance: 15.840467020307052 vec: [24.000, 33.000, 10.000, 49.000, 2.000]
: :
: :
Count: 150
And for the seeds data
Input Path: hdfs://childnode1:8020/user/Masternode/seeds/kmeans_out/clusteredPoints/part-m-00000
Key class: class org.apache.hadoop.io.IntWritable Value Class: class org.apache.mahout.clustering.classify.WeightedPropertyVectorWritable
Key: 24: Value: wt: 1.0 distance: 861.9071668766143 vec: [14840.000, 15260.000, 5763.000, 3312.000, 2221.000, 5220.000, 871.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1668.6900610565315 vec: [14570.000, 14880.000, 5554.000, 3333.000, 1018.000, 4956.000, 881.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 694.0022557455687 vec: [14090.000, 14290.000, 5291.000, 3337.000, 2699.000, 4825.000, 905.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1137.750249826391 vec: [13940.000, 13840.000, 5324.000, 3379.000, 2259.000, 4805.000, 896.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 2066.340415883421 vec: [14990.000, 16140.000, 5658.000, 3562.000, 1355.000, 5175.000, 903.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 508.48008032866846 vec: [14210.000, 14380.000, 5386.000, 3312.000, 2462.000, 4956.000, 895.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 939.0255545463335 vec: [14490.000, 14690.000, 5563.000, 3259.000, 3586.000, 5219.000, 880.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 693.4040772257339 vec: [14100.000, 14110.000, 5420.000, 3302.000, 2700.000, 5000.000, 891.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 2457.5552311826623 vec: [15460.000, 16630.000, 6053.000, 3465.000, 2040.000, 5877.000, 875.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 2136.715539711948 vec: [15250.000, 16440.000, 5884.000, 3505.000, 1969.000, 5533.000, 888.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 2037.64908599264 vec: [14850.000, 15260.000, 5714.000, 3242.000, 4543.000, 5314.000, 870.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1183.0427661481044 vec: [14160.000, 14030.000, 5438.000, 3201.000, 1717.000, 5001.000, 880.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1668.9093470760495 vec: [14020.000, 13890.000, 5439.000, 3199.000, 3986.000, 4738.000, 888.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1129.8172629441638 vec: [14060.000, 13780.000, 5479.000, 3156.000, 3136.000, 4872.000, 876.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1114.6345778285634 vec: [14050.000, 13740.000, 5482.000, 3114.000, 2932.000, 4825.000, 874.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1616.574316222106 vec: [14280.000, 14590.000, 5351.000, 3333.000, 4185.000, 4781.000, 899.000, 1.000]
Key: 177: Value: wt: 1.0 distance: 2237.7216511574297 vec: [13830.000, 13990.000, 5119.000, 3383.000, 5234.000, 4781.000, 918.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1533.017981297028 vec: [14750.000, 15690.000, 5527.000, 3514.000, 1599.000, 5046.000, 906.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1141.7747846041293 vec: [14210.000, 14700.000, 5205.000, 3466.000, 1767.000, 4649.000, 915.000, 1.000]
Key: 177: Value: wt: 1.0 distance: 1073.5112523108057 vec: [13570.000, 12720.000, 5226.000, 3049.000, 4102.000, 4914.000, 869.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 673.1032757822736 vec: [14400.000, 14160.000, 5658.000, 3129.000, 3072.000, 5176.000, 858.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 588.5794743674002 vec: [14260.000, 14110.000, 5520.000, 3168.000, 2688.000, 5219.000, 872.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 2307.6221450474227 vec: [14900.000, 15880.000, 5618.000, 3507.000, 765.000, 5091.000, 899.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 3165.469289919653 vec: [13230.000, 12080.000, 5099.000, 2936.000, 1415.000, 4961.000, 866.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1022.3111169642691 vec: [14760.000, 15010.000, 5789.000, 3245.000, 1791.000, 5001.000, 866.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 2453.594725178613 vec: [15160.000, 16190.000, 5833.000, 3421.000, 903.000, 5307.000, 885.000, 1.000]
Key: 177: Value: wt: 1.0 distance: 1842.0639979312034 vec: [13760.000, 13020.000, 5395.000, 3026.000, 3373.000, 4825.000, 864.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 2127.2694725299393 vec: [13670.000, 12740.000, 5395.000, 2956.000, 2504.000, 4869.000, 856.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 638.121303238346 vec: [14180.000, 14110.000, 5541.000, 3221.000, 2754.000, 5038.000, 882.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1570.180781482017 vec: [14020.000, 13450.000, 5516.000, 3065.000, 3531.000, 5097.000, 860.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 2442.635062314189 vec: [13820.000, 13160.000, 5454.000, 2975.000, 855.000, 5056.000, 866.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1252.115173918792 vec: [14940.000, 15490.000, 5757.000, 3371.000, 3412.000, 5228.000, 872.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1405.0316200008115 vec: [14410.000, 14090.000, 5717.000, 3186.000, 3920.000, 5299.000, 853.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 954.5724423251518 vec: [14170.000, 13940.000, 5585.000, 3150.000, 2124.000, 5012.000, 873.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 729.6383182264915 vec: [14680.000, 15050.000, 5712.000, 3328.000, 2129.000, 5360.000, 878.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1655.6627215851186 vec: [15000.000, 16120.000, 5709.000, 3485.000, 2270.000, 5443.000, 1000.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1818.907545412335 vec: [15270.000, 16200.000, 5826.000, 3464.000, 2823.000, 5527.000, 873.000, 1.000]
Key: 76: Value: wt: 1.0 distance: 2107.009330735528 vec: [15380.000, 17080.000, 5832.000, 3683.000, 2956.000, 5484.000, 908.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 512.998557374996 vec: [14520.000, 14800.000, 5656.000, 3288.000, 3112.000, 5309.000, 882.000, 1.000]
Key: 177: Value: wt: 1.0 distance: 3176.186184965435 vec: [14170.000, 14280.000, 5397.000, 3298.000, 6685.000, 5001.000, 894.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1291.054421730103 vec: [13850.000, 13540.000, 5348.000, 3156.000, 2587.000, 5178.000, 887.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1382.5623224377814 vec: [13850.000, 13500.000, 5351.000, 3158.000, 2249.000, 5176.000, 885.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1853.3448494492047 vec: [13550.000, 13160.000, 5138.000, 3201.000, 2461.000, 4783.000, 901.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 2315.520560114591 vec: [14860.000, 15500.000, 5877.000, 3396.000, 4711.000, 5528.000, 882.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 695.316069521979 vec: [14540.000, 15110.000, 5579.000, 3462.000, 3128.000, 5180.000, 899.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1478.614814571006 vec: [14040.000, 13800.000, 5376.000, 3155.000, 1560.000, 4961.000, 879.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1508.3478794444047 vec: [14760.000, 15360.000, 5701.000, 3393.000, 1367.000, 5132.000, 886.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 481.97134871269895 vec: [14560.000, 14990.000, 5570.000, 3377.000, 2958.000, 5175.000, 888.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 183.71730361238392 vec: [14520.000, 14790.000, 5545.000, 3291.000, 2704.000, 5111.000, 882.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 630.6956414080009 vec: [14670.000, 14860.000, 5678.000, 3258.000, 2129.000, 5351.000, 868.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1346.3622667032168 vec: [14400.000, 14430.000, 5585.000, 3272.000, 3975.000, 5144.000, 875.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 3192.1534037702677 vec: [14910.000, 15780.000, 5674.000, 3434.000, 5593.000, 5136.000, 892.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1513.9463214768837 vec: [14610.000, 14490.000, 5715.000, 3113.000, 4116.000, 5396.000, 854.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 778.4078250448471 vec: [14280.000, 14330.000, 5504.000, 3199.000, 3328.000, 5224.000, 883.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1243.4189505293368 vec: [14600.000, 14520.000, 5741.000, 3113.000, 1481.000, 5487.000, 856.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 915.6820426338833 vec: [14770.000, 15030.000, 5702.000, 3212.000, 1933.000, 5439.000, 866.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 365.8725763036213 vec: [14350.000, 14460.000, 5388.000, 3377.000, 2802.000, 5044.000, 882.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1551.4815116461734 vec: [14430.000, 14920.000, 5384.000, 3412.000, 1142.000, 5088.000, 901.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1041.0816986203454 vec: [14770.000, 15380.000, 5662.000, 3419.000, 1999.000, 5222.000, 886.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 3069.1391786047893 vec: [13470.000, 12110.000, 5159.000, 3032.000, 1502.000, 4519.000, 839.000, 1.000]
Key: 177: Value: wt: 1.0 distance: 2234.4099731961946 vec: [12860.000, 11420.000, 5008.000, 2850.000, 2700.000, 4607.000, 868.000, 1.000]
Key: 177: Value: wt: 1.0 distance: 2723.181787028119 vec: [12630.000, 11230.000, 4902.000, 2879.000, 2269.000, 4703.000, 884.000, 1.000]
Key: 177: Value: wt: 1.0 distance: 1679.8632384057453 vec: [13190.000, 12360.000, 5076.000, 3042.000, 3220.000, 4605.000, 892.000, 1.000]
Key: 177: Value: wt: 1.0 distance: 1525.05309032792 vec: [13840.000, 13220.000, 5395.000, 3070.000, 4157.000, 5088.000, 868.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 2603.1739906169455 vec: [13570.000, 12780.000, 5262.000, 3026.000, 1176.000, 4782.000, 872.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 2164.8100360288104 vec: [13500.000, 12880.000, 5139.000, 3119.000, 2352.000, 4607.000, 888.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1379.1305505369796 vec: [14370.000, 14340.000, 5630.000, 3190.000, 1313.000, 5150.000, 873.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 802.2824508737174 vec: [14290.000, 14010.000, 5609.000, 3158.000, 2217.000, 5132.000, 862.000, 1.000]
Key: 24: Value: wt: 1.0 distance: 1230.3844556714478 vec: [14390.000, 14370.000, 5569.000, 3153.000, 1464.000, 5300.000, 873.000, 1.000]
Key: 177: Value: wt: 1.0 distance: 1533.2295745642984 vec: [13750.000, 12730.000, 5412.000, 2882.000, 3533.000, 5067.000, 846.000, 1.000]
Key: 76: Value: wt: 1.0 distance: 1242.0782987132768 vec: [15980.000, 17630.000, 6191.000, 3561.000, 4076.000, 6060.000, 867.000, 2.000]
Key: 76: Value: wt: 1.0 distance: 2284.8314552331644 vec: [15670.000, 16840.000, 5998.000, 3484.000, 4675.000, 5877.000, 862.000, 2.000]
Key: 76: Value: wt: 1.0 distance: 1865.3220945799494 vec: [15730.000, 17260.000, 5978.000, 3594.000, 4539.000, 5791.000, 876.000, 2.000]
Key: 76: Value: wt: 1.0 distance: 802.7846465017649 vec: [16260.000, 19110.000, 6154.000, 3930.000, 2936.000, 6079.000, 908.000, 2.000]
Key: 76: Value: wt: 1.0 distance: 2130.8931341845782 vec: [15510.000, 16820.000, 6017.000, 3486.000, 4004.000, 5841.000, 879.000, 2.000]
Key: 76: Value: wt: 1.0 distance: 2497.1535713990975 vec: [15620.000, 16770.000, 5927.000, 3438.000, 4920.000, 5795.000, 864.000, 2.000]
Key: 76: Value: wt: 1.0 distance: 1519.304200689462 vec: [15910.000, 17320.000, 6064.000, 3403.000, 3824.000, 5922.000, 860.000, 2.000]
Key: 76: Value: wt: 1.0 distance: 2415.437198684845 vec: [17230.000, 20710.000, 6579.000, 3814.000, 4451.000, 6451.000, 876.000, 2.000]
Key: 76: Value: wt: 1.0 distance: 1538.7977024244756 vec: [16490.000, 18940.000, 6445.000, 3639.000, 5064.000, 6362.000, 875.000, 2.000]
Key: 76: Value: wt: 1.0 distance: 1983.9629595142953 vec: [15550.000, 17120.000, 5850.000, 3566.000, 2858.000, 5746.000, 889.000, 2.000]
: :
: :
Count: 210
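A quick way to check a dump like this without reading every record is to count the points per cluster and look for any NaN. The sketch below is only an illustration: summarize_dump is a made-up helper, and it assumes the seqdumper text has the "Key: ... Value: wt: ... distance: ... vec: [...]" shape shown in this thread (for example after redirecting seqdumper to a text file).

```python
import re
from collections import Counter

# One clusteredPoints record as printed by seqdumper in this thread.
# \s+ between tokens lets a record span a line break in pasted output.
RECORD = re.compile(
    r"Key:\s+(\d+):\s+Value:\s+wt:\s+\S+\s+distance:\s+(\S+)\s+vec:\s+([\[{][^\]}]*[\]}])"
)


def summarize_dump(text):
    """Return (points per cluster id, number of records containing NaN)."""
    sizes = Counter()
    nan_records = 0
    for key, distance, vec in RECORD.findall(text):
        sizes[int(key)] += 1
        if "NaN" in distance or "NaN" in vec:
            nan_records += 1
    return sizes, nan_records


# Three records copied from the iris output earlier in this post
dump = """Key: 4: Value: wt: 1.0 distance: 1.4694216549377501 vec: [35.000, 14.000, 2.000, 51.000, 1.000]
Key: 4: Value: wt: 1.0 distance: 4.381689171997383 vec: [30.000, 14.000, 2.000, 49.000, 1.000]
Key: 128: Value: wt: 1.0 distance: 12.27183013149531 vec: [32.000, 47.000, 14.000, 70.000, 2.000]"""

sizes, nan_records = summarize_dump(dump)
print(sorted(sizes.items()), nan_records)
# prints: [(4, 2), (128, 1)] 0
```

On a full dump the per-cluster counts should add up to the Count line, and nan_records should be 0 once the input has been rescaled.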
And for the mahout clusterdump output, ie the clusters:
For the iris data
VL-128{n=62 c=[27.484, 43.935, 14.339, 59.016, 2.226] r=[2.939, 5.048, 2.951, 4.626, 0.418]}
    Weight : [props - optional]: Point:
    1.0 : [distance=12.27183013149531]: [32.000, 47.000, 14.000, 70.000, 2.000]
    1.0 : [distance=6.845135447338132]: [32.000, 45.000, 15.000, 64.000, 2.000]
    1.0 : [distance=7.318849411742204]: [23.000, 40.000, 13.000, 55.000, 2.000]
    : : : :
VL-4{n=50 c=[34.180, 14.640, 2.440, 50.060, 1.000] r=[0:3.772, 1:1.718, 2:1.061, 3:3.489]}
    Weight : [props - optional]: Point:
    1.0 : [distance=1.4694216549377501]: [35.000, 14.000, 2.000, 51.000, 1.000]
    1.0 : [distance=4.381689171997383]: [30.000, 14.000, 2.000, 49.000, 1.000]
    1.0 : [distance=4.123008610226189]: [32.000, 13.000, 2.000, 47.000, 1.000]
    : : : :
VL-100{n=38 c=[30.737, 57.421, 20.711, 68.500, 2.947] r=[2.862, 4.821, 2.762, 4.876, 0.223]}
    Weight : [props - optional]: Point:
    1.0 : [distance=10.234304921679705]: [31.000, 49.000, 15.000, 69.000, 2.000]
    1.0 : [distance=8.516482308683988]: [30.000, 50.000, 17.000, 67.000, 2.000]
    1.0 : [distance=7.773365278708711]: [33.000, 60.000, 25.000, 63.000, 3.000]
    : : : :
And for the seeds data
VL-76{n=61 c=[16297.377, 18721.803, 6208.934, 3722.672, 3603.590, 6066.098, 885.115, 1.984] r=[470.329, 1087.056, 218.340, 150.079, 1222.928, 222.042, 14.901, 0.127]}
    Weight : [props - optional]: Point:
    1.0 : [distance=2107.009330735528]: [15380.000, 17080.000, 5832.000, 3683.000, 2956.000, 5484.000, 908.000, 1.000]
    1.0 : [distance=1242.0782987132768]: [15980.000, 17630.000, 6191.000, 3561.000, 4076.000, 6060.000, 867.000, 2.000]
    1.0 : [distance=2284.8314552331644]: [15670.000, 16840.000, 5998.000, 3484.000, 4675.000, 5877.000, 862.000, 2.000]
    : : : :
VL-177{n=77 c=[13274.805, 11964.416, 5229.286, 2872.922, 4759.740, 5088.519, 852.208, 2.766] r=[370.481, 808.956, 141.698, 162.027, 1292.441, 182.253, 22.964, 0.643]}
    Weight : [props - optional]: Point:
    1.0 : [distance=2237.7216511574297]: [13830.000, 13990.000, 5119.000, 3383.000, 5234.000, 4781.000, 918.000, 1.000]
    1.0 : [distance=1073.5112523108057]: [13570.000, 12720.000, 5226.000, 3049.000, 4102.000, 4914.000, 869.000, 1.000]
    1.0 : [distance=1842.0639979312034]: [13760.000, 13020.000, 5395.000, 3026.000, 3373.000, 4825.000, 864.000, 1.000]
    : : : :
VL-24{n=72 c=[14460.417, 14648.472, 5563.778, 3277.903, 2648.931, 5192.319, 880.556, 1.194] r=[531.885, 1108.561, 218.877, 158.321, 1093.229, 318.248, 21.182, 0.461]}
    Weight : [props - optional]: Point:
    1.0 : [distance=861.9071668766143]: [14840.000, 15260.000, 5763.000, 3312.000, 2221.000, 5220.000, 871.000, 1.000]
    1.0 : [distance=1668.6900610565315]: [14570.000, 14880.000, 5554.000, 3333.000, 1018.000, 4956.000, 881.000, 1.000]
    1.0 : [distance=694.0022557455687]: [14090.000, 14290.000, 5291.000, 3337.000, 2699.000, 4825.000, 905.000, 1.000]
    : : : :
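Because the centres above are in rescaled units, it can help to divide them back by the factor used before interpreting them. The following is a rough sketch only: parse_clusters and unscale are made-up helpers, the regular expression assumes the dense "VL-id{n=... c=[...]" header format shown in this post, and the factor of 10 for the iris data is my assumption from the values. The last component (the class index) was never scaled, so it is left alone.

```python
import re

# Cluster header as printed by clusterdump in this thread: VL-<id>{n=<size> c=[<centre>] ...
CLUSTER = re.compile(r"VL-(\d+)\{n=(\d+) c=\[([^\]]*)\]")


def parse_clusters(text):
    """Return a list of (cluster id, size, centre) tuples from clusterdump text."""
    clusters = []
    for cid, n, centre in CLUSTER.findall(text):
        values = [float(v) for v in centre.split(",")]
        clusters.append((int(cid), int(n), values))
    return clusters


def unscale(centre, factor, n_scaled):
    """Divide the first n_scaled components by factor; leave the rest unchanged."""
    return [v / factor if i < n_scaled else v for i, v in enumerate(centre)]


# The three iris cluster headers from the clusterdump output in this post
headers = """VL-128{n=62 c=[27.484, 43.935, 14.339, 59.016, 2.226] r=[2.939, 5.048, 2.951, 4.626, 0.418]}
VL-4{n=50 c=[34.180, 14.640, 2.440, 50.060, 1.000] r=[0:3.772, 1:1.718, 2:1.061, 3:3.489]}
VL-100{n=38 c=[30.737, 57.421, 20.711, 68.500, 2.947] r=[2.862, 4.821, 2.762, 4.876, 0.223]}"""

clusters = parse_clusters(headers)
print(sum(n for _, n, _ in clusters))  # prints: 150
for cid, n, centre in clusters:
    # factor 10 and 4 scaled columns are assumptions for the iris data
    print(cid, n, unscale(centre, 10, 4))
```

The cluster sizes add up to 150, which matches the Count line of the iris seqdumper output; the seeds clusters (61 + 77 + 72) add up to 210 in the same way.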
So the mahout arff.vector command works fine once the data has been rescaled to integers. Always!
Sorry for the delay in replying!