Description
Affected Stackable version
spark operator version: 25.11.0
Affected Apache Spark-on-Kubernetes version
spark version: 3.5.7-stackable25.11.0
Current and expected behavior
The YAML used to submit the Spark job is as follows:
`apiVersion: spark.stackable.tech/v1alpha1
kind: SparkApplication
metadata:
  name: spark-streaming
  namespace: spark
spec:
  sparkImage:
    productVersion: 3.5.7
  mode: cluster
  mainClass: org.apache.spark.examples.SparkPi
  mainApplicationFile: "local:///stackable/spark/examples/jars/spark-examples.jar"
  args:
    - "1000"
  sparkConf:
    spark.kubernetes.submission.waitAppCompletion: "false"
    # Remove the fixed pod name and let the operator manage it
    spark.kubernetes.driver.pod.name: "spark-streaming-driver"
    spark.kubernetes.executor.podNamePrefix: "spark-streaming"
    # --- Celeborn configuration ---
    spark.shuffle.manager: "org.apache.spark.shuffle.celeborn.SparkShuffleManager"
    spark.celeborn.master.endpoints: "celeborn-master-0.celeborn-master-svc.spark.svc.cluster.local"
    spark.celeborn.client.push.enabled: "true"
    spark.celeborn.client.fetch.enabled: "true"
    spark.kubernetes.scheduler.name: "volcano"
    spark.kubernetes.driver.pod.featureSteps: "org.apache.spark.deploy.k8s.features.VolcanoFeatureStep"
    spark.kubernetes.executor.pod.featureSteps: "org.apache.spark.deploy.k8s.features.VolcanoFeatureStep"
    # --- Dynamic allocation and resources ---
    spark.dynamicAllocation.enabled: "false"
    spark.executor.instances: "2"
    spark.executor.cores: "2"
    spark.executor.memory: "1g"
    spark.driver.cores: "1"
    spark.driver.memory: "512m"
    # Also specify the SA in the Spark conf so that Spark's internal client knows which SA to use
    spark.kubernetes.driver.serviceAccountName: "spark-app-sa"
    spark.kubernetes.executor.serviceAccountName: "spark-app-sa"
  driver:
    # Use podOverrides to set the Kubernetes-native serviceAccountName
    podOverrides:
      spec:
        serviceAccountName: "spark-app-sa"
    config:
      resources:
        cpu:
          min: "1"
          max: "2"
        memory:
          limit: "1Gi"
  executor:
    replicas: 1
    podOverrides:
      spec:
        serviceAccountName: "spark-app-sa"
    config:
      resources:
        cpu:
          min: "1700m"
          max: "3"
        memory:
          limit: "2Gi"
`
The error log is as follows:

Possible solution
No response
Additional context
No response
Environment
No response
Would you like to work on fixing this bug?
None