Until now, the default execution engine for Hive on my cluster was MapReduce. My Hive MapReduce jobs started failing with the error discussed here, so I switched the execution engine to Spark, which does not throw that error.
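(For reference, the switch is just the standard Hive property, set in hive-site.xml or per session; a minimal sketch:)

```xml
<!-- hive-site.xml: run Hive queries on Spark instead of MapReduce -->
<property>
  <name>hive.execution.engine</name>
  <value>spark</value>
</property>
```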
However, we have Hive jobs scheduled through Oozie running constantly throughout the day, and people also use Hive from the Hive editor in Hue. In some of my Oozie-scheduled jobs, I now see this error:
Error: Error while compiling statement: FAILED: SemanticException Failed to get a spark session: org.apache.hadoop.hive.ql.metadata.HiveException: Failed to create spark client. (state=42000,code=40000)
What can I do so that Hive can always get a Spark session, no matter how many jobs are running concurrently? I cannot afford to have any of my jobs fail.

Here's my cluster configuration:

- YARN containers are allowed a maximum of 6 GB of memory, which has worked fine with MapReduce.
- Spark Executor Cores: 4
- Spark Executor Maximum Java Heap Size: 2 GB
- Spark Driver Memory Overhead: 26 MiB
- Spark Executor Memory Overhead: 26 MiB
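Assuming those Cloudera Manager fields map to the usual Hive-on-Spark properties (a sketch, not verified against your version), this is equivalent to:

```xml
<!-- hive-site.xml: executor sizing as configured above.
     spark.yarn.*.memoryOverhead is the pre-Spark-2.3 name; newer releases
     use spark.executor.memoryOverhead / spark.driver.memoryOverhead. -->
<property>
  <name>spark.executor.cores</name>
  <value>4</value>
</property>
<property>
  <name>spark.executor.memory</name>
  <value>2g</value>
</property>
<property>
  <name>spark.yarn.executor.memoryOverhead</name>
  <value>26</value> <!-- MiB -->
</property>
<property>
  <name>spark.yarn.driver.memoryOverhead</name>
  <value>26</value> <!-- MiB -->
</property>
```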
My node configuration:

- 1 master node with the Spark server on it: 16 vCPUs, 64 GB memory
- 3 worker nodes with HDFS and YARN on them: 16 vCPUs, 64 GB memory
What should the values of the parameters above be? My guess is 6 executors, each with a 25 GB heap and 7 GB of executor memory overhead. Please correct me if I am wrong.
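(The arithmetic behind that guess, assuming executors are memory-bound and YARN can hand out essentially all of each worker's RAM: 25 GB heap + 7 GB overhead = 32 GB per executor; 3 workers × 64 GB = 192 GB total; 192 / 32 = 6 executors, i.e. two per node.)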
Host status:

| Name | IP | Roles | Last Heartbeat | Disk Usage | Physical Memory |
|------|----|-------|----------------|------------|-----------------|
| IP1 | IP1 | 3 Role(s) | 6.6s ago | 504.1 GiB / 2 TiB | 11.9 GiB / 62.5 GiB |
| IP2 | IP2 | 3 Role(s) | 6.71s ago | 494 GiB / 2 TiB | 10.5 GiB / 62.5 GiB |
| IP3 | IP3 | 6 Role(s) | 7.41s ago | 1.1 TiB / 1.4 TiB | 10.7 GiB / 31 GiB |
| IP4 | IP4 | 15 Role(s) | 6.53s ago | 1 TiB / 2.4 TiB | 52.9 GiB / 62.5 GiB |
| IP5 | IP5 | 2 Role(s) | 14.07s ago | 2.8 GiB / 2 TiB | 998.7 MiB / 62.5 GiB |
| (YARN NodeManager) | | | 3.66s ago | 927.5 GiB / 1000 GiB | 6.7 GiB / 31 GiB |
| IP7 | IP7 | 2 Role(s) | 4.44s ago | | |
There is a YARN tuning spreadsheet for Cloudera; filling out the Excel sheet helps you calculate some of the parameters, such as vcores and memory, automatically.

Set the YARN container memory and maximum allocation to be greater than Spark executor memory plus overhead; check `yarn.scheduler.maximum-allocation-mb` and/or `yarn.nodemanager.resource.memory-mb`. In your case, the YARN container memory limit might be smaller than what the Spark executors require.
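As an illustration of that relationship (values are hypothetical and must be sized to your own nodes):

```xml
<!-- yarn-site.xml: illustrative values only.
     yarn.scheduler.maximum-allocation-mb must be at least
     spark.executor.memory + executor memory overhead, or executors
     of that size will never be granted a container. -->
<property>
  <name>yarn.nodemanager.resource.memory-mb</name>
  <value>57344</value> <!-- e.g. ~56 GiB of each 64 GiB worker offered to YARN -->
</property>
<property>
  <name>yarn.scheduler.maximum-allocation-mb</name>
  <value>32768</value> <!-- >= 25 GiB heap + 7 GiB overhead proposed above -->
</property>
```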