
Unable to run spark-shell command with k8s as master

Contributor

Hello all - I'm trying to run the spark-shell command below from the bin directory of my extracted Spark 3.4.3 installation. I specified my Kubernetes cluster as the master because I'd like the executors to run on k8s.

I created a service account with all necessary permissions.

# kubectl auth can-i create pod --as=system:serviceaccount:my-namespace:spark-sa -n my-namespace                                                         

yes
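(The spark commands below target the dynx-center-resources namespace, so the equivalent check for that namespace would be:)

# kubectl auth can-i create pod --as=system:serviceaccount:dynx-center-resources:spark-sa -n dynx-center-resources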

Command:

# export K8S_TOKEN=$(kubectl get secrets -o jsonpath="{.items[?(@.metadata.annotations['kubernetes\.io/service-account\.name']=='spark-sa')].data.token}"|base64 --decode)

# ./bin/spark-shell \
--master k8s://https://my-k8s-cluster:6443 \
--deploy-mode client \
--name spark-shell-poc \
--conf spark.executor.instances=1 \
--conf spark.kubernetes.container.image=my-docker-hub/spark_poc:v1.4 \
--conf spark.kubernetes.container.image.pullPolicy=IfNotPresent \
--conf spark.kubernetes.namespace=dynx-center-resources \
--conf spark.driver.pod.name=dynx-spark-driver \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-sa \
--conf spark.kubernetes.authenticate.submission.oauthToken=$K8S_TOKEN
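(Note: on Kubernetes 1.24 and later, long-lived token Secrets are no longer created automatically for service accounts, so the jsonpath lookup above may return an empty token. In that case a short-lived token can be requested directly; the namespace is taken from the commands above:)

# export K8S_TOKEN=$(kubectl create token spark-sa -n dynx-center-resources)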

Even though I specified the service account to use and its token, Spark always ends up using the 'system:anonymous' user to create pods in my k8s environment, and because of that I get the error below (a snippet from a much larger stack trace).

25/02/06 14:36:32 WARN ExecutorPodsSnapshotsStoreImpl: Exception when notifying snapshot subscriber.
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://my-k8s-cluster:6443/api/v1/namespaces/dynx-center-resources/pods. Message: pods is forbidden: User "system:anonymous" cannot create resource "pods" in API group "" in the namespace "dynx-center-resources". Received status: Status(apiVersion=v1, code=403, details=StatusDetails(causes=[], group=null, kind=pods, name=null, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=pods is forbidden: User "system:anonymous" cannot create resource "pods" in API group "" in the namespace "dynx-center-resources", metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Forbidden, status=Failure, additionalProperties={}).
at io.fabric8.kubernetes.client.KubernetesClientException.copyAsCause(KubernetesClientException.java:238)
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:538)
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:558)
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleCreate(OperationSupport.java:349)
at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:711)
at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:93)
at io.fabric8.kubernetes.client.dsl.internal.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:42)
at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:1113)
at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:93)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$1(ExecutorPodsAllocator.scala:440)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.requestNewExecutors(ExecutorPodsAllocator.scala:417)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$36(ExecutorPodsAllocator.scala:370)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$36$adapted(ExecutorPodsAllocator.scala:363)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.onNewSnapshots(ExecutorPodsAllocator.scala:363)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3(ExecutorPodsAllocator.scala:134)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3$adapted(ExecutorPodsAllocator.scala:134)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber.org$apache$spark$scheduler$cluster$k8s$ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber$$processSnapshotsInternal(ExecutorPodsSnapshotsStoreImpl.scala:143)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber.processSnapshots(ExecutorPodsSnapshotsStoreImpl.scala:131)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsSnapshotsStoreImpl.$anonfun$addSubscriber$1(ExecutorPodsSnapshotsStoreImpl.scala:85)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)

As part of troubleshooting, I ran the curl command below using the same service account token and got a valid response.

curl -X GET https://my-k8s-cluster:6443/api --header "Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IjdWOXgwTjdIeUdCTGx2eEItOXZ3eDlSV1I1UXd1d0MtTXJENFBhXzNDTTgifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJkeW54LWNlbnRlci1yZXNvdXJjZXMiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlY3JldC5uYW1lIjoic3Bhcmstc2EiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoic3Bhcmstc2EiLCJrdWJlcm5ldGVzxxxxxxxxxxxxxxxxxxxxxmMS03NzI5LTQ5OTAtYWZkOC1mYjZiNzU4ZDg5YzAiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6ZHlueC1jZW50ZXItcmVzb3VyY2VzOnNwYXJrLXNhIn0.TWAQYmu_N-N1gnZ1hYYn_wvavs9f9w33v0P0Kgchd1eETO8TpHlYS_JSt8jzWlX6C4JF293Q8VRk8p1Nx3zRdqjZnYWmMvJYCaq5mBAyvXAW8fXW_ZtQD7HJPUEUb2ZDXUz3b2XLgvJoWui8vhqZBYUev67YgHHRspgkwDbLrRIB1oRPbx_2osYMQW3tPxoThyzUqdvyBij3hjW-syrsp_sR1ir-78XzIZpkV2OBFds7u8vd0IqoWLOtmnZwdq1RKCKtFk292VfWSbN0HYJUs_aJUeaqLpekopZLfDM2U_GT0ImwBUOL2EILpb-K1xdWr4-Jv4qPsFBLFh31S2OMAg" --insecure
{
 "kind": "APIVersions",
 "versions": [
   "v1"
 ],
 "serverAddressByClientCIDRs": [
   {
     "clientCIDR": "0.0.0.0/0",
     "serverAddress": "10.14.3.19:6443"
   }
 ]
}
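Since /api is often readable even by anonymous users, a stronger check is to hit the same pods endpoint the driver is failing on (path taken from the error message above), using the same token:

# curl -X GET https://my-k8s-cluster:6443/api/v1/namespaces/dynx-center-resources/pods --header "Authorization: Bearer $K8S_TOKEN" --insecure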

However, if I run the spark-submit command below in cluster deploy mode, it runs without any issue and produces the desired output.

# ./bin/spark-submit \
--master k8s://https://my-k8s-cluster:6443 \
--deploy-mode cluster \
--name spark-poc \
--conf spark.executor.instances=2 \
--conf spark.kubernetes.container.image=my-docker-hub/spark_poc:v1.4 \
--conf spark.kubernetes.container.image.pullPolicy=IfNotPresent \
--conf spark.kubernetes.namespace=dynx-center-resources \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-sa \
--conf spark.kubernetes.authenticate.submission.oauthToken=$K8S_TOKEN \
--class org.apache.spark.examples.SparkPi \
local:///opt/spark/examples/jars/spark-examples_2.12-3.4.3.jar 1000

Not sure what I'm missing. Appreciate any help on this.

 

1 REPLY

Master Mentor

@spserd 
Looking at your issue with Spark on Kubernetes, the difference between the client and cluster deploy modes is what's causing the "system:anonymous" authentication problem. When you run spark-shell in client mode, Spark tries to create the executor pods as "system:anonymous" instead of using your service account "spark-sa", even though you provided the token.

Possible Solution
For client mode, you need to add a specific configuration to tell Spark which service account to use when creating executor pods:

--conf spark.kubernetes.authenticate.executor.serviceAccountName=spark-sa

So your updated command should look like this:

./bin/spark-shell \
--master k8s://https://my-k8s-cluster:6443 \
--deploy-mode client \
--name spark-shell-poc \
--conf spark.executor.instances=1 \
--conf spark.kubernetes.container.image=my-docker-hub/spark_poc:v1.4 \
--conf spark.kubernetes.container.image.pullPolicy=IfNotPresent \
--conf spark.kubernetes.namespace=dynx-center-resources \
--conf spark.driver.pod.name=dynx-spark-driver \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-sa \
--conf spark.kubernetes.authenticate.executor.serviceAccountName=spark-sa \
--conf spark.kubernetes.authenticate.submission.oauthToken=$K8S_TOKEN

The key is that in client mode, you need to explicitly configure the executor authentication because the driver is running outside the cluster and needs to delegate this permission.
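If the driver's Kubernetes client still shows up as system:anonymous after that, you can also try handing the token (and, if needed, the cluster CA certificate) to the driver-side client directly. These are the client-mode authentication properties documented for Spark on Kubernetes; please verify them against the docs for your Spark version, and the CA path here is only a placeholder:

--conf spark.kubernetes.authenticate.oauthToken=$K8S_TOKEN \
--conf spark.kubernetes.authenticate.caCertFile=/path/to/ca.crt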

If this still doesn't work, ensure your service account has appropriate ClusterRole bindings that allow it to create and manage pods in the specified namespace.
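A minimal sketch of that, using the built-in edit ClusterRole bound inside the namespace (the binding name is just an example):

kubectl create rolebinding spark-sa-edit --clusterrole=edit --serviceaccount=dynx-center-resources:spark-sa -n dynx-center-resources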

Happy hadooping