Describe the bug
I have tainted an autoscaling GKE node pool with the taint cloud.google.com/gke-nodepool=jenkins-agent-pool, but agents with a matching toleration cannot be spun up because of the taint. If I remove the taint from the node pool, agents are spun up.
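For reference, the taint on the nodes in that pool looks like this (reconstructed from the node pool setup described above, not copied from the cluster):

```yaml
# Taint carried by every node in jenkins-agent-pool (reconstructed, not
# captured from the cluster)
spec:
  taints:
    - key: cloud.google.com/gke-nodepool
      value: jenkins-agent-pool
      effect: NoSchedule
```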
The example in the chart's values.yaml shows how to add a toleration to an agent using yamlTemplate, which I have implemented as follows:
```yaml
agent:
  kubernetesConnectTimeout: 30
  kubernetesReadTimeout: 30
  image:
    repository: jenkins/inbound-agent
    tag: "alpine-jdk21"
  #idleMinutes: 5
  #websocket: true
  alwaysPullImage: true
  showRawYaml: true
  resources:
    requests:
      cpu: 600m
      memory: 2G
    limits:
      cpu: 940m
      memory: 2.75G
  podName: jenkins-agent
  connectTimeout: 300
  yamlTemplate: |-
    apiVersion: v1
    kind: Pod
    spec:
      tolerations:
        - key: cloud.google.com/gke-nodepool
          operator: Equal
          value: jenkins-agent-pool
          effect: NoSchedule
  yamlMergeStrategy: "merge"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: cloud.google.com/gke-nodepool
                operator: In
                values:
                  - jenkins-agent-pool
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
```
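With yamlMergeStrategy: "merge" I would expect the plugin to produce an agent pod whose spec combines the toleration from yamlTemplate with the node affinity from agent.affinity, roughly as sketched below. This is the result I expect, not output captured from the cluster:

```yaml
# Expected (not observed) fragment of the generated agent pod spec
spec:
  tolerations:
    - key: cloud.google.com/gke-nodepool
      operator: Equal
      value: jenkins-agent-pool
      effect: NoSchedule
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: cloud.google.com/gke-nodepool
                operator: In
                values:
                  - jenkins-agent-pool
```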
Version of Helm and Kubernetes
- Helm: version.BuildInfo{Version:"v3.13", GitCommit:"", GitTreeState:"", GoVersion:"go1.23.1"}
- Kubernetes: 1.30.5-gke.1713000
Chart version
jenkins/jenkins 5.8.2
What happened?
1. Create a tainted, autoscaling agent node pool on GKE
2. Add the agent toleration to values.yaml as per the example
3. Run a Jenkins job
4. The job log reports that a node cannot be scaled up for the agent because of the taint (see the check sketched below)
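Because showRawYaml is enabled, the generated pod manifest is printed in the build log, and the pending pod can also be inspected directly; the fragment below is what I would expect to find if the toleration from yamlTemplate had been applied (the pod name in the comment is a placeholder):

```yaml
# kubectl get pod jenkins-agent-<suffix> -o yaml   (pod name is a placeholder)
# Expected toleration on the pending agent pod:
tolerations:
  - key: cloud.google.com/gke-nodepool
    operator: Equal
    value: jenkins-agent-pool
    effect: NoSchedule
```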
What you expected to happen?
A node to be created for the agent.
How to reproduce it
```yaml
agent:
  kubernetesConnectTimeout: 30
  kubernetesReadTimeout: 30
  image:
    repository: jenkins/inbound-agent
    tag: "alpine-jdk21"
  #idleMinutes: 5
  #websocket: true
  alwaysPullImage: true
  showRawYaml: true
  resources:
    requests:
      cpu: 600m
      memory: 2G
    limits:
      cpu: 940m
      memory: 2.75G
  podName: jenkins-agent
  connectTimeout: 300
  yamlTemplate: |-
    apiVersion: v1
    kind: Pod
    spec:
      tolerations:
        - key: cloud.google.com/gke-nodepool
          operator: Equal
          value: jenkins-agent-pool
          effect: NoSchedule
  yamlMergeStrategy: "merge"
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: cloud.google.com/gke-nodepool
                operator: In
                values:
                  - jenkins-agent-pool
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "true"
```
Anything else we need to know?
I searched the kubernetes-plugin issues and found a potentially relevant one, though it concerns the controller rather than the agent: "Tolerations are not getting overwritten via 'Raw YAML for the Pod'".