Member since: 09-10-2019
Posts: 7
Kudos Received: 0
Solutions: 1
My Accepted Solutions
Title | Views | Posted |
---|---|---|
| 11771 | 08-12-2021 12:03 AM |
08-12-2021 04:20 AM
Just adding that I double-checked with a reduced executor count and memory:

.config("spark.executor.memory", "3G")
.config("spark.num.executors", "5")

but the application still starts only 2 executors.
08-12-2021 04:03 AM
Thanks for the detailed follow-up. Your suggestion that I'm not getting the executors requested because the "cluster is not capable of providing the requested 10 executors of 8GB" is an option my client should check. I will forward your suggestion to them so they can discuss it further in the case they'll open with Cloudera (I'm not their data engineer).
08-12-2021 12:03 AM
My client is a Cloudera customer. I will let him know. Many thanks for your help.
08-10-2021 03:47 AM
I've tried the configuration you provided with dynamic allocation enabled. The UI timeline still shows the start of only 2 executors, while the Environment tab shows spark.num.executors=10 as set. The application is set up to run on datasets of diverse sizes, but I was able to reduce the load for calculating the large ones through changes to the code. The calculations involve shuffle operations (such as reduceByKey), but since the dataset size is not fixed I don't see how I can use a fixed estimated shuffle input size to calculate spark.sql.shuffle.partitions. Thanks again for the tips, but please let me know how I can force the cluster to start a specific number of executors, or which internal configuration I should look into to fix this issue.
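For what it's worth, here is a rough sketch of deriving the shuffle partition count from the actual input size at runtime rather than from a fixed estimate; the 128 MB target partition size, the dataset path, the floor of 200 partitions, and the use of the Hadoop FileSystem API are illustrative assumptions, not anything prescribed in this thread:

```python
# Illustrative only: size spark.sql.shuffle.partitions from the input that is
# actually being processed instead of from a fixed estimate.
from pyspark.sql import SparkSession

spark = SparkSession.builder.master("yarn").appName("shuffle-sizing-sketch").getOrCreate()

def input_size_bytes(path):
    """Total size in bytes of the files under `path` (Hadoop FileSystem API)."""
    sc = spark.sparkContext
    hadoop_path = sc._jvm.org.apache.hadoop.fs.Path(path)
    fs = hadoop_path.getFileSystem(sc._jsc.hadoopConfiguration())
    return fs.getContentSummary(hadoop_path).getLength()

TARGET_PARTITION_BYTES = 128 * 1024 * 1024       # ~128 MB per shuffle partition
size = input_size_bytes("/data/input_dataset")   # hypothetical dataset path
num_partitions = max(200, int(size / TARGET_PARTITION_BYTES))
spark.conf.set("spark.sql.shuffle.partitions", str(num_partitions))
```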
08-01-2021 01:54 PM
Thanks for the detailed reply. The issue was encountered by other colleagues, and I only ran into it recently. I will forward the reply to my colleagues and will test the proposed configuration once I get back to the office. As prior configurations worked well, following intensive tests on datasets of various kinds, I prefer not to apply dynamic allocation unless it is absolutely necessary. To my understanding, YARN should let users define the number of executors and build the cluster accordingly. I'll return with more info once we've tested the configuration you proposed.
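In case it becomes relevant after those tests, here is a minimal sketch of what the dynamic-allocation variant could look like; the min/initial/max bounds are placeholders rather than values recommended anywhere in this thread, and on YARN the external shuffle service has to be enabled on the NodeManagers for executors to be released safely:

```python
# Sketch of a dynamic-allocation setup on YARN (placeholder bounds).
# Requires the external shuffle service on the NodeManagers.
from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .master("yarn")
         .config("spark.dynamicAllocation.enabled", "true")
         .config("spark.shuffle.service.enabled", "true")
         .config("spark.dynamicAllocation.minExecutors", "2")     # placeholder
         .config("spark.dynamicAllocation.initialExecutors", "5") # placeholder
         .config("spark.dynamicAllocation.maxExecutors", "10")    # placeholder
         .appName("dynamic-allocation-sketch")
         .getOrCreate())
```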
07-29-2021 01:57 AM
Hello, I need your advice regarding what seems like strange behavior in the cluster I'm using. Running Spark (2.4, Cloudera) on YARN, the configuration requests the setup of 10 executors:

spark = (SparkSession
.builder.master("yarn")
.config("spark.executor.cores", "12")
.config("spark.executor.memory", "8G")
.config("spark.num.executors", "10")
.config("spark.driver.memory", "6G")
.config("spark.yarn.am.memoryOverhead", "6G")
.config("spark.executor.memoryOverhead", "5G")
.config("spark.driver.memoryOverhead", "5G")
.config("spark.sql.hive.convertMetastoreOrc", "true")
.config("spark.executor.heartbeatInterval", "60s")
.config("spark.network.timeout", "600s")
.config("spark.driver.maxResultSize", "2g")
.config("spark.driver.cores","4")
.config("spark.executor.extraClassPath", "-Dhdp.version=current")
.config("spark.debug.maxToStringFields", 200)
.config("spark.sql.catalogImplementation", "hive")
.config("spark.memory.fraction", "0.8")
.config("spark.memory.storageFraction", "0.2")
.config("spark.sql.hive.filesourcePartitionFileCacheSize", "0")
.config("spark.yarn.maxAppAttempts", "10")
.appName(app_name)
.enableHiveSupport().getOrCreate())

However, the UI/YARN logs show the start of only 2 executors, with a third starting at a later stage to replace the second:

21/07/29 10:44:01 INFO Executor: Starting executor ID 3 on host <node name>
21/07/29 10:43:54 WARN YarnAllocator: Container killed by YARN for exceeding memory limits. 10.0 GB of 10 GB physical memory used. Consider boosting spark.yarn.executor.memoryOverhead.

The memory issues wouldn't have emerged had the application started the number of executors requested. Can you think of any reason why the cluster might "choose" to reduce the number of executors it sets up? Any internal YARN or other configuration I should examine?
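As an aside on the numbers above: each executor container YARN has to grant is roughly spark.executor.memory plus spark.executor.memoryOverhead, i.e. about 8G + 5G = 13G and 12 vcores here, ten times over, and the queue/node limits have to cover that. Below is a minimal sketch of reading back the settings the running session actually resolved; the particular keys printed are my assumption about what matters here:

```python
# Quick check, assuming the `spark` session created above: print the executor
# settings the application actually resolved, since these decide how many
# containers YARN is asked for and how big each one is (memory + overhead, cores).
conf = spark.sparkContext.getConf()
for key in ("spark.executor.instances",
            "spark.executor.memory",
            "spark.executor.memoryOverhead",
            "spark.executor.cores",
            "spark.dynamicAllocation.enabled"):
    print(key, "=", conf.get(key, "<not set>"))
```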
Labels:
- Apache Spark