
Add header to correlation matrix in Spark


I am computing a correlation matrix from a CSV file using Apache Spark. When loading the data I have to skip the first row, which is the header holding the dataset's column names; otherwise I can't load the data. The correlation is computed, but I can't add the column names as a header to the resulting correlation matrix. Could you please help me get the matrix with its header? Thanks. This is what I did:

import org.apache.spark.mllib.linalg.{Vector,Vectors}
import org.apache.spark.mllib.stat.Statistics
import org.apache.spark.mllib.linalg.Matrix
import org.apache.spark.rdd.RDD

// Drop the header line, which sits at the start of the first partition.
val data = sc.textFile(strfilePath).mapPartitionsWithIndex { case (index, iterator) =>
  if (index == 0) iterator.drop(1) else iterator
}
val inputMatrix: RDD[Vector] = data.map { line =>
  val values = line.split(",").map(_.toDouble)
  Vectors.dense(values)
}
val correlationMatrix: Matrix = Statistics.corr(inputMatrix, "pearson")


@Maher Hattabi Did you find a solution?


Hi, I ran into the same problem. Is there any solution for printing the matrix with a header/title?
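
One possible approach, as a rough sketch rather than a confirmed answer: the matrix returned by Statistics.corr carries no column names, so the names have to be read from the header line separately and attached when the matrix is printed. The code below assumes sc and strfilePath are defined as in the question, that the header is the first line of the file, and that every remaining field parses as a Double. The helper formatWithHeader is a hypothetical name introduced for illustration; it is not part of Spark.

```scala
import org.apache.spark.mllib.linalg.{Matrix, Vector, Vectors}
import org.apache.spark.mllib.stat.Statistics
import org.apache.spark.rdd.RDD

// Assumption: `sc` and `strfilePath` exist as in the question.
val lines = sc.textFile(strfilePath)

// Read the header once and keep the column names on the driver.
val header: Array[String] = lines.first().split(",").map(_.trim)

// Drop the header from the first partition, then parse the numeric rows.
val rows: RDD[Vector] = lines
  .mapPartitionsWithIndex { case (index, iterator) =>
    if (index == 0) iterator.drop(1) else iterator
  }
  .map(line => Vectors.dense(line.split(",").map(_.toDouble)))

val correlationMatrix: Matrix = Statistics.corr(rows, "pearson")

// Hypothetical helper (not a Spark API): renders the matrix as tab-separated
// text with the column names as the header row and as row labels.
def formatWithHeader(names: Array[String], m: Matrix): String = {
  require(names.length == m.numRows && names.length == m.numCols,
    "expected one name per row and column of the matrix")
  val headerRow = ("" +: names).mkString("\t")
  val body = names.indices.map { i =>
    val cells = names.indices.map(j => f"${m(i, j)}%.4f")
    (names(i) +: cells).mkString("\t")
  }
  (headerRow +: body).mkString("\n")
}

println(formatWithHeader(header, correlationMatrix))
```

The correlation matrix is only as large as the number of columns, so building the labelled text on the driver should be cheap even when the input file is big. The number formatting follows the JVM's default locale.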