Issue with Scala Spark CrossValidator

I am using Spark version 2.1.0.cloudera2 and get an error when running the code below in Scala.

A plain random forest classifier works fine, but I get this error when running cross-validation for parameter tuning.

The same code works perfectly with CrossValidator in PySpark.


19/11/14 20:47:14 ERROR scheduler.TaskSetManager: Task 1 in stage 190.0 failed 4 times; aborting job

Name: org.apache.spark.SparkException
Message: Job aborted due to stage failure: Task 1 in stage 190.0 failed 4 times
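
The two lines above are only the driver-side summary; the underlying failure is normally reported further along in the same message or in the executor logs. As a minimal sketch (the object name `Causes` and the depth limit of 32 are illustrative choices, not part of the original post), one way to print everything the driver-side exception carries is to walk its cause chain. This assumes the exception actually has a cause attached, which is not guaranteed for every Spark job failure.

```scala
object Causes {
  /** Collect the exception and each nested cause, outermost first.
    * The take(32) is an arbitrary safety limit against very deep chains. */
  def chain(t: Throwable): List[Throwable] =
    Iterator.iterate(t)(_.getCause).takeWhile(_ != null).take(32).toList

  /** One line per exception in the chain: class name plus message. */
  def describe(t: Throwable): String =
    chain(t).map(e => s"${e.getClass.getName}: ${e.getMessage}").mkString("\n")
}
```

Wrapping the failing call in a `try`/`catch` and printing `Causes.describe(e)` would show each nested exception with its full message, which may be more informative than the truncated notebook output.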



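import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder}

// Search grid over three random forest hyperparameters
val paramGrid = new ParamGridBuilder()
  .addGrid(randomForestClassifier.maxBins, Array(25, 28, 31))
  .addGrid(randomForestClassifier.maxDepth, Array(4, 6, 8))
  .addGrid(randomForestClassifier.impurity, Array("entropy", "gini"))
  .build()

// 5-fold cross-validation over the pipeline and grid defined above
val cv = new CrossValidator()
  .setEstimator(pipeline)
  .setEvaluator(evaluator)
  .setEstimatorParamMaps(paramGrid)
  .setNumFolds(5)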
val cvModel =
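
For a sense of how much work this grid asks of the cluster, the arithmetic can be sketched in plain Scala. The values are copied from the `ParamGridBuilder` call above; the object name `GridCost` is made up for illustration. The count assumes CrossValidator fits every parameter combination once per fold and then refits the best combination on the full dataset.

```scala
object GridCost {
  // Candidate values per hyperparameter, as listed in the post
  val maxBins  = Seq(25, 28, 31)
  val maxDepth = Seq(4, 6, 8)
  val impurity = Seq("entropy", "gini")
  val numFolds = 5

  // Every combination of the three parameters: 3 * 3 * 2
  val combinations: Int = maxBins.size * maxDepth.size * impurity.size

  // Each combination is fitted once per fold during the search
  val fitsDuringSearch: Int = combinations * numFolds

  // Plus one final refit of the best combination on all the data
  val totalFits: Int = fitsDuringSearch + 1
}
```

So the cross-validated run trains the pipeline far more often than the single random forest fit that works, which is worth keeping in mind when comparing the two.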
