06-25-2020 12:06 PM
I tried to fit a random forest classifier in PySpark, but I'm getting this error:

Py4JJavaError: An error occurred while calling o767.fit.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 30.0 failed 1 times, most recent failure: Lost task 0.0 in stage 30.0 (TID 853, localhost, executor driver): java.lang.OutOfMemoryError: Java heap space

Can anyone help me, please?

My code:

from pyspark.ml.classification import RandomForestClassifier
from pyspark.ml.evaluation import BinaryClassificationEvaluator
from pyspark.ml.tuning import CrossValidator, ParamGridBuilder

rf = RandomForestClassifier(labelCol="label", featuresCol="features")

# Single-entry grid: only numTrees=100 is evaluated
paramGrid = (ParamGridBuilder()
             .addGrid(rf.numTrees, [100])
             .build())

crossval = CrossValidator(estimator=rf,
                          estimatorParamMaps=paramGrid,
                          evaluator=BinaryClassificationEvaluator(),
                          numFolds=10)

# The OutOfMemoryError is raised by this call
cvModel = crossval.fit(trainingData)

# transform() belongs to the fitted model (cvModel), not to the CrossValidator
predictions = cvModel.transform(testData)
predictions.printSchema()
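For a sense of how much work the `fit` call above requests, here is a rough sketch in plain Python (no Spark needed). It relies on the fact that `CrossValidator` trains one model per fold for every parameter map and then refits the best parameter map once on the full training set. The helper name `cross_validation_workload` is made up for illustration and is not part of PySpark.

```python
def cross_validation_workload(num_folds, num_param_maps, num_trees):
    """Count the forests and trees trained by a k-fold cross-validated fit.

    Hypothetical helper for illustration only; assumes every parameter map
    uses the same num_trees, as in the one-entry grid from the post.
    """
    # One forest is trained for each (fold, parameter map) pair.
    cv_forests = num_folds * num_param_maps
    # After picking the best parameter map, the estimator is refit once
    # on the whole training set.
    total_forests = cv_forests + 1
    return {"forests": total_forests, "trees": total_forests * num_trees}


# Values taken from the post: numFolds=10, one parameter map, numTrees=100.
workload = cross_validation_workload(num_folds=10, num_param_maps=1, num_trees=100)
print(workload)  # {'forests': 11, 'trees': 1100}
```

With these settings the job trains 11 forests, 1,100 trees in total. Because the error message shows `localhost, executor driver`, the job is running in local mode, where the tasks share the driver JVM's heap; the tree count and the driver memory setting (`spark.driver.memory`) are therefore two places one might look first.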
Labels: Apache Spark