Member since: 01-12-2023 · Posts: 3 · Kudos Received: 0 · Solutions: 0
01-16-2023 12:40 AM
I'm using Machine Learning Workspace in Cloudera Data Platform (CDP). I created a session with 4 vCPU / 16 GiB memory and enabled Spark 3.2.0. I'm using Spark to load one month of data (around 12 GB in total), apply some transformations, and then write the result as Parquet files to AWS S3. My Spark session configuration looks like this:

```python
SparkSession.builder
    .appName(appName)
    .config("spark.driver.memory", "8G")
    .config("spark.dynamicAllocation.enabled", "true")
    .config("spark.dynamicAllocation.minExecutors", "4")
    .config("spark.dynamicAllocation.maxExecutors", "20")
    .config("spark.executor.cores", "4")
    .config("spark.executor.memory", "8G")
    .config("spark.sql.shuffle.partitions", 500)
    ......
```

Before the data are written to Parquet files, they are repartitioned:

```python
# floor and rand are pyspark.sql.functions (Python's math.floor does not work on a Column)
(df.withColumn("salt", floor(rand() * 100))
   .repartition("date_year", "date_month", "date_day", "salt")
   .drop("salt")
   .write.partitionBy("date_year", "date_month")
   .mode("overwrite")
   .parquet(SOME__PATH))
```

The data transformations with Spark run successfully, but the job always fails in the last step, when writing the data to Parquet files. Below is an example of the error message:

```
23/01/15 21:10:59 678 ERROR TaskSchedulerImpl: Lost executor 2 on 100.100.18.155:
The executor with id 2 exited with exit code -1(unexpected).
The API gave the following brief reason: Evicted
The API gave the following message: Pod ephemeral local storage usage exceeds the total limit of containers 10Gi
```

I don't think there is a problem with my Spark configuration. The problem is the Kubernetes ephemeral local storage size limit, which I do not have the rights to change. Can someone explain why this happened and what a possible solution for it would be?
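For context: an eviction like this usually means shuffle and spill files written to the executor pod's local scratch directories (`spark.local.dir`) exceeded the pod's ephemeral-storage limit. One workaround documented for Spark on Kubernetes (3.1+) is to back executor local directories with on-demand persistent volumes instead of ephemeral storage, so scratch data no longer counts against that 10 Gi limit. Below is a minimal sketch of the relevant settings, assuming the platform honours them in a CML session; the storage class `gp2` and the size `50Gi` are assumptions, not values from this thread:

```python
# Sketch: Spark-on-Kubernetes (>= 3.1) settings that mount an on-demand
# PersistentVolumeClaim as executor scratch space. The "spark-local-dir-"
# volume-name prefix is what makes Spark use this volume for spark.local.dir,
# so shuffle/spill data lands on the PVC instead of ephemeral storage.
prefix = "spark.kubernetes.executor.volumes.persistentVolumeClaim.spark-local-dir-1"
local_dir_pvc_conf = {
    f"{prefix}.options.claimName": "OnDemand",  # Spark creates one PVC per executor
    f"{prefix}.options.storageClass": "gp2",    # assumed storage class; check your cluster
    f"{prefix}.options.sizeLimit": "50Gi",      # assumed size; size for your shuffle volume
    f"{prefix}.mount.path": "/data",
    f"{prefix}.mount.readOnly": "false",
}
# Each pair would be applied via SparkSession.builder.config(key, value).
```

Whether these keys can be set from inside a CML session depends on what the workspace administrators allow, so treat this as something to verify rather than a guaranteed fix.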
01-13-2023 01:43 AM
Hi Smarak, thanks for your answer. That helps me!
01-12-2023 05:08 AM
In CDP Public Cloud Machine Learning, we can create a new session with reserved resources, for example 4 vCPU and 16 GiB memory. We can also create a Spark session inside the Machine Learning workbench with its own memory configuration, for example:

```python
spark = (
    SparkSession.builder
    .appName(appName)
    .config("spark.driver.memory", "16G")
    .config("spark.executor.instances", "10")
    .config("spark.executor.cores", "4")
    .config("spark.executor.memory", "20G")
    .getOrCreate()
)
```

My question is: how will memory be allocated to the Spark session in this case? Is the reserved resource of the Machine Learning session (4 vCPU and 16 GiB memory) the upper limit for total Spark memory usage? And how many worker nodes and executors can I configure for the Spark session?
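For scale, it helps to work out what the executor settings above request in aggregate, independently of where any limit applies. On Spark-on-Kubernetes, each executor pod requests roughly `spark.executor.memory` plus a memory overhead, which defaults to the larger of 384 MiB and 10% of executor memory. A back-of-the-envelope sketch (the 10% factor is Spark's default for JVM jobs and can be overridden with `spark.executor.memoryOverhead`):

```python
# Back-of-the-envelope: per-executor pod memory request on Spark-on-Kubernetes.
# Default overhead = max(384 MiB, 0.10 * executor memory).
def executor_pod_memory_mib(executor_memory_mib, overhead_factor=0.10):
    overhead = max(384, int(executor_memory_mib * overhead_factor))
    return executor_memory_mib + overhead

per_exec = executor_pod_memory_mib(20 * 1024)  # spark.executor.memory = 20G
total = 10 * per_exec                          # spark.executor.instances = 10
print(per_exec, total)  # 22528 MiB per executor, 225280 MiB (~220 GiB) total
```

So the configuration shown requests about 220 GiB across executors (plus the 16 GiB driver), which is far beyond the 16 GiB reserved for the session itself; whether the session reservation caps only the driver/engine pod or the executors too is exactly the question above.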