
Spark - machine learning : Type mismatch while transform


Rising Star

I am facing an error when calling tokenizer.transform. Please advise.

-----------------------------------------------------------------------------------------------------------------------------------------------------------

import org.apache.spark.ml.feature.{HashingTF, IDF, Tokenizer}

val sentenceData = sqlContext.createDataFrame(Seq( (0, "Hi I heard about Spark"), (0, "I wish Java could use case classes"), (1, "Logistic regression models are neat") )).toDF("label", "sentence")

val tokenizer = new Tokenizer().setInputCol("sentence").setOutputCol("words")

val wordsData = tokenizer.transform(sentenceData)

-----------------------------------------------------------------------------------------------------------------------------------------------------------

Error message for reference -->

import org.apache.spark.ml.feature

sentenceData: org.apache.spark.sql.DataFrame = [label: int, sentence: string]
tokenizer: org.apache.spark.ml.feature.Tokenizer = tok_6ac8a05b403d

<console>:61: error: type mismatch;
 found   : org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.DataFrame
 required: org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.org.apache.spark.sql.DataFrame
       val wordsData = tokenizer.transform(sentenceData)

1 ACCEPTED SOLUTION


Re: Spark - machine learning : Type mismatch while transform

Super Guru

You have Java and Spark installed, and you are running on one of those machines.

You can restart your server.

That is out-of-the-box Apache Spark test code.

Are you running in the shell?

Did my code run?

1.6.0 is not the best release; can you run on 1.6.1 or 1.6.2?


7 REPLIES

Re: Spark - machine learning : Type mismatch while transform

Super Guru

Try this.

What version of Spark are you using?

http://spark.apache.org/docs/1.6.1/ml-features.html#tokenizer

import org.apache.spark.ml.feature.{RegexTokenizer, Tokenizer}

val sentenceDataFrame = sqlContext.createDataFrame(Seq(
  (0, "Hi I heard about Spark"),
  (1, "I wish Java could use case classes"),
  (2, "Logistic,regression,models,are,neat")
)).toDF("label", "sentence")

val tokenizer = new Tokenizer().setInputCol("sentence").setOutputCol("words")
val regexTokenizer = new RegexTokenizer()
  .setInputCol("sentence")
  .setOutputCol("words")
  .setPattern("\\W") // alternatively .setPattern("\\w+").setGaps(false)

val tokenized = tokenizer.transform(sentenceDataFrame)
tokenized.select("words", "label").take(3).foreach(println)
val regexTokenized = regexTokenizer.transform(sentenceDataFrame)
regexTokenized.select("words", "label").take(3).foreach(println)
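
For anyone comparing the two tokenizers above, here is a minimal plain-Scala sketch (no Spark needed) of roughly what each one does to a sentence. It is an approximation for illustration, not Spark's actual implementation: it assumes Tokenizer lowercases and splits on whitespace, and that RegexTokenizer with its default gaps setting lowercases, splits on the pattern, and drops empty tokens. The names TokenizerSketch, whitespaceTokens and regexTokens are made up for this example.

```scala
// Plain-Scala approximation of the two tokenizers; no Spark dependency.
// All names here are hypothetical and exist only for this illustration.
object TokenizerSketch {
  // Roughly what Tokenizer does: lowercase, then split on whitespace characters.
  def whitespaceTokens(s: String): Seq[String] =
    s.toLowerCase.split("\\s").toSeq

  // Roughly what RegexTokenizer does when the pattern marks the gaps:
  // lowercase, split on the pattern, and drop empty tokens.
  def regexTokens(s: String, pattern: String = "\\W"): Seq[String] =
    pattern.r.split(s.toLowerCase).toSeq.filter(_.nonEmpty)

  def main(args: Array[String]): Unit = {
    val sentence = "Logistic,regression,models,are,neat"
    // No whitespace in the sentence, so this yields a single token.
    println(whitespaceTokens(sentence))
    // Commas are non-word characters, so this yields five tokens.
    println(regexTokens(sentence))
  }
}
```

This is why the third sample sentence uses commas: the whitespace tokenizer keeps it as one token, while the "\\W" pattern breaks it into individual words.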



Re: Spark - machine learning : Type mismatch while transform

Rising Star

@Timothy Spann - I am using version 1.6:

sc.version
res377: String = 1.6.0

I still get that error and am not sure why.

Re: Spark - machine learning : Type mismatch while transform

Rising Star
@Timothy Spann

I am working on the Hortonworks Sandbox 2.4 in an Azure environment and am currently running the program in Zeppelin. Your code threw the same error listed above. Please advise.


Re: Spark - machine learning : Type mismatch while transform

Rising Star

I guess it has something to do with the Zeppelin version. I didn't face the issue when running the same code in spark-shell. Thanks for the support, as always.

Re: Spark - machine learning : Type mismatch while transform

Rising Star

@Timothy Spann: What is missing in the Zeppelin version of the Hortonworks Sandbox 2.4 on Azure that causes this error?
