Unable to run spark-shell command with k8s as master
Labels: Apache Spark
Created 02-10-2025 12:58 PM
Hello all - I'm trying to run the spark-shell command below from the bin directory of my extracted Spark 3.4.3 installation. I set the master to my Kubernetes cluster because I'd like the executors to run on k8s.
I created a service account with all the necessary permissions:
# kubectl auth can-i create pod --as=system:serviceaccount:my-namespace:spark-sa -n my-namespace
yes
Command:
# export K8S_TOKEN=$(kubectl get secrets -o jsonpath="{.items[?(@.metadata.annotations['kubernetes\.io/service-account\.name']=='spark-sa')].data.token}"|base64 --decode)
# ./bin/spark-shell \
--master k8s://https://my-k8s-cluster:6443 \
--deploy-mode client \
--name spark-shell-poc \
--conf spark.executor.instances=1 \
--conf spark.kubernetes.container.image=my-docker-hub/spark_poc:v1.4 \
--conf spark.kubernetes.container.image.pullPolicy=IfNotPresent \
--conf spark.kubernetes.namespace=dynx-center-resources \
--conf spark.driver.pod.name=dynx-spark-driver \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-sa \
--conf spark.kubernetes.authenticate.submission.oauthToken=$K8S_TOKEN
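(A side note on the token export above: it reads the token from a pre-created service-account Secret. On newer Kubernetes versions (1.24+), where such Secrets are no longer generated automatically, an equivalent short-lived token can be requested directly; the namespace here matches the one from the can-i check:)
# kubectl create token spark-sa -n my-namespace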
Even though I specified the service account and its token, Spark always ends up using the 'system:anonymous' user to create pods in my k8s environment, and because of that I get the error below (a snippet from a much larger stack trace).
25/02/06 14:36:32 WARN ExecutorPodsSnapshotsStoreImpl: Exception when notifying snapshot subscriber.
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://my-k8s-cluster:6443/api/v1/namespaces/dynx-center-resources/pods. Message: pods is forbidden: User "system:anonymous" cannot create resource "pods" in API group "" in the namespace "dynx-center-resources". Received status: Status(apiVersion=v1, code=403, details=StatusDetails(causes=[], group=null, kind=pods, name=null, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=pods is forbidden: User "system:anonymous" cannot create resource "pods" in API group "" in the namespace "dynx-center-resources", metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=Forbidden, status=Failure, additionalProperties={}).
at io.fabric8.kubernetes.client.KubernetesClientException.copyAsCause(KubernetesClientException.java:238)
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:538)
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:558)
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleCreate(OperationSupport.java:349)
at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:711)
at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handleCreate(BaseOperation.java:93)
at io.fabric8.kubernetes.client.dsl.internal.CreateOnlyResourceOperation.create(CreateOnlyResourceOperation.java:42)
at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:1113)
at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.create(BaseOperation.java:93)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$requestNewExecutors$1(ExecutorPodsAllocator.scala:440)
at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:158)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.requestNewExecutors(ExecutorPodsAllocator.scala:417)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$36(ExecutorPodsAllocator.scala:370)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$onNewSnapshots$36$adapted(ExecutorPodsAllocator.scala:363)
at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.onNewSnapshots(ExecutorPodsAllocator.scala:363)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3(ExecutorPodsAllocator.scala:134)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsAllocator.$anonfun$start$3$adapted(ExecutorPodsAllocator.scala:134)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber.org$apache$spark$scheduler$cluster$k8s$ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber$$processSnapshotsInternal(ExecutorPodsSnapshotsStoreImpl.scala:143)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsSnapshotsStoreImpl$SnapshotsSubscriber.processSnapshots(ExecutorPodsSnapshotsStoreImpl.scala:131)
at org.apache.spark.scheduler.cluster.k8s.ExecutorPodsSnapshotsStoreImpl.$anonfun$addSubscriber$1(ExecutorPodsSnapshotsStoreImpl.scala:85)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:539)
at java.base/java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305)
at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1136)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:635)
at java.base/java.lang.Thread.run(Thread.java:840)
As part of troubleshooting, I ran the curl command below with the same service account token and got the following result (a further permission check using the token directly is shown after the output).
curl -X GET https://my-k8s-cluster:6443/api --header "Authorization: Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IjdWOXgwTjdIeUdCTGx2eEItOXZ3eDlSV1I1UXd1d0MtTXJENFBhXzNDTTgifQ.eyJpc3MiOiJrdWJlcm5ldGVzL3NlcnZpY2VhY2NvdW50Iiwia3ViZXJuZXRlcy5pby9zZXJ2aWNlYWNjb3VudC9uYW1lc3BhY2UiOiJkeW54LWNlbnRlci1yZXNvdXJjZXMiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlY3JldC5uYW1lIjoic3Bhcmstc2EiLCJrdWJlcm5ldGVzLmlvL3NlcnZpY2VhY2NvdW50L3NlcnZpY2UtYWNjb3VudC5uYW1lIjoic3Bhcmstc2EiLCJrdWJlcm5ldGVzxxxxxxxxxxxxxxxxxxxxxmMS03NzI5LTQ5OTAtYWZkOC1mYjZiNzU4ZDg5YzAiLCJzdWIiOiJzeXN0ZW06c2VydmljZWFjY291bnQ6ZHlueC1jZW50ZXItcmVzb3VyY2VzOnNwYXJrLXNhIn0.TWAQYmu_N-N1gnZ1hYYn_wvavs9f9w33v0P0Kgchd1eETO8TpHlYS_JSt8jzWlX6C4JF293Q8VRk8p1Nx3zRdqjZnYWmMvJYCaq5mBAyvXAW8fXW_ZtQD7HJPUEUb2ZDXUz3b2XLgvJoWui8vhqZBYUev67YgHHRspgkwDbLrRIB1oRPbx_2osYMQW3tPxoThyzUqdvyBij3hjW-syrsp_sR1ir-78XzIZpkV2OBFds7u8vd0IqoWLOtmnZwdq1RKCKtFk292VfWSbN0HYJUs_aJUeaqLpekopZLfDM2U_GT0ImwBUOL2EILpb-K1xdWr4-Jv4qPsFBLFh31S2OMAg" --insecure
{
"kind": "APIVersions",
"versions": [
"v1"
],
"serverAddressByClientCIDRs": [
{
"clientCIDR": "0.0.0.0/0",
"serverAddress": "10.14.3.19:6443"
}
]
}%
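(The check above only confirms that the API server accepts the token; a more targeted check, using the token itself rather than --as impersonation and the namespace from the error message, would be something like:)
# kubectl --server=https://my-k8s-cluster:6443 --token="$K8S_TOKEN" --insecure-skip-tls-verify=true auth can-i create pods -n dynx-center-resources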
However, if I run the spark-submit command below in cluster deploy mode, it runs without any issue and produces the desired output.
# ./bin/spark-submit \
--master k8s://https://my-k8s-cluster:6443 \
--deploy-mode cluster \
--name spark-poc \
--conf spark.executor.instances=2 \
--conf spark.kubernetes.container.image=my-docker-hub/spark_poc:v1.4 \
--conf spark.kubernetes.container.image.pullPolicy=IfNotPresent \
--conf spark.kubernetes.namespace=dynx-center-resources \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-sa \
--conf spark.kubernetes.authenticate.submission.oauthToken=$K8S_TOKEN \
--class org.apache.spark.examples.SparkPi \
local:///opt/spark/examples/jars/spark-examples_2.12-3.4.3.jar 1000
Not sure what I'm missing. Appreciate any help on this.
Created 03-23-2025 04:42 AM
@spserd
Looking at your issue with Spark on Kubernetes, I see a clear difference between the client and cluster deployment modes that is causing the authentication problem. When you run spark-shell in client mode, Spark tries to create the executor pods as "system:anonymous" instead of using your service account "spark-sa", despite the token you provided.
Possible Solution
For client mode, you need to add a specific configuration to tell Spark which service account to use for executor pod creation.
Your updated command should look like this:
./bin/spark-shell \
--master k8s://https://my-k8s-cluster:6443 \
--deploy-mode client \
--name spark-shell-poc \
--conf spark.executor.instances=1 \
--conf spark.kubernetes.container.image=my-docker-hub/spark_poc:v1.4 \
--conf spark.kubernetes.container.image.pullPolicy=IfNotPresent \
--conf spark.kubernetes.namespace=dynx-center-resources \
--conf spark.driver.pod.name=dynx-spark-driver \
--conf spark.kubernetes.authenticate.driver.serviceAccountName=spark-sa \
--conf spark.kubernetes.authenticate.executor.serviceAccountName=spark-sa \
--conf spark.kubernetes.authenticate.submission.oauthToken=$K8S_TOKEN
The key point is that in client mode you need to configure the executor authentication explicitly, because the driver is running outside the cluster and has to pass this permission along itself.
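If the executor service account alone doesn't resolve it, it may also be worth setting the client-mode authentication properties (the ones without a submission. or driver. prefix), so that the driver itself presents the token when requesting executors. A sketch, reusing your token variable (the CA file path is a placeholder for your cluster's CA bundle):
--conf spark.kubernetes.authenticate.oauthToken=$K8S_TOKEN \
--conf spark.kubernetes.authenticate.caCertFile=/path/to/your/ca.crt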
If this still doesn't work, ensure your service account has appropriate Role or ClusterRole bindings that allow it to create and manage pods in the specified namespace, e.g. along the lines of the sketch below.
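A minimal sketch of such a binding, assuming a new role named spark-pod-manager and the namespace from your config (adjust names and resources to your setup):
# kubectl create role spark-pod-manager \
    --verb=get,list,watch,create,delete,deletecollection \
    --resource=pods,services,configmaps,persistentvolumeclaims \
    -n dynx-center-resources
# kubectl create rolebinding spark-pod-manager-binding \
    --role=spark-pod-manager \
    --serviceaccount=dynx-center-resources:spark-sa \
    -n dynx-center-resources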
Happy hadooping
