Type Error when attempting Linear Regression

Guru
import org.apache.spark.mllib.regression.LinearRegressionWithSGD
import org.apache.spark.mllib.regression.LabeledPoint
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.ml.feature.{OneHotEncoder, StringIndexer}
import sqlContext.implicits._


val df = sqlContext.sql("select mnemonic, average, median, stddev from wellbook.curve_statistics")


val indexer = new StringIndexer()
  .setInputCol("mnemonic")
  .setOutputCol("mnemonicIndex")
  .fit(df)
val indexed = indexer.transform(df)


val encoder = new OneHotEncoder().setInputCol("mnemonicIndex").
  setOutputCol("mnemonicVec")
val encoded = encoder.transform(indexed)
val data = encoded.select("mnemonicVec", "average", "median", "stddev")


val parsedData = data.map(row => LabeledPoint(row.getDouble(0), row.getAs[Vector](1)))

<console>:297: error: kinds of the type arguments (Vector) do not conform to the expected kinds of the type parameters (type T).
Vector's type parameters do not match type T's expected parameters:
type Vector has one type parameter, but type T has none
       val parsedData = data.map(row => LabeledPoint(row.getDouble(0), row.getAs[Vector](1))

1 ACCEPTED SOLUTION

Contributor

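In addition to Vectors, you need to import Spark's Vector class explicitly. Scala brings its own built-in Vector type (scala.collection.immutable.Vector, which takes a type parameter) into scope by default, so an unqualified Vector resolves to that one instead of Spark's. Try this: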

import org.apache.spark.mllib.linalg.{Vector, Vectors}
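With that import in place, the mapping step might look like the sketch below. It is meant for spark-shell on Spark 1.x and reuses the `encoded` DataFrame from the question. Two things here are assumptions rather than anything stated in the thread: that "average" is the intended label, and that the one-hot vector is the only feature. The columns are selected label-first so that the indices in the map match the types being read, since the original select puts the vector in column 0.

```scala
// Sketch for spark-shell (Spark 1.x); `encoded` is the DataFrame from the question.
// The explicit import makes the unqualified name Vector refer to MLlib's Vector
// rather than Scala's generic built-in Vector.
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.mllib.regression.LabeledPoint

// Assumption: "average" is the label and "mnemonicVec" is the feature vector.
// Selecting the label first puts the double in column 0 and the vector in column 1.
val data = encoded.select("average", "mnemonicVec")

// Assumes "average" is a non-null double column; getDouble fails otherwise.
val parsedData = data.map { row =>
  LabeledPoint(row.getDouble(0), row.getAs[Vector](1))
}
```

If a different column is the real label, only the select order needs to change; the map can stay as it is.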

6 REPLIES

Contributor

Which version of Spark and HDP are you using?

Guru
@Dhruv Kumar

Spark 1.4.1 and HDP 2.3.2

Super Collaborator

Vedant, give this a shot:

val parsedData = data.map(row => LabeledPoint(row.getDouble(0), row.asInstanceOf[Vector](1)))

Guru

@Joe Widen I tried that earlier and it gave me the same error.

Contributor

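In addition to Vectors, you need to import Spark's Vector class explicitly. Scala brings its own built-in Vector type (scala.collection.immutable.Vector, which takes a type parameter) into scope by default, so an unqualified Vector resolves to that one instead of Spark's. Try this: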

import org.apache.spark.mllib.linalg.{Vector, Vectors}

Guru

@Dhruv Kumar Thanks, it worked.