-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
use replicas when initializing pg minResources #4000
base: master
Are you sure you want to change the base?
use replicas when initializing pg minResources #4000
Conversation
/assign @hzxuzhonghu |
Same situation as mentioned in this issue: |
proportion plugin controls the queue related capabilities, if you disable it, the capacity check will not take effect. |
Just as @hwdef said, please set an annotation to control the minmember to set, and also other workload type like statefulset should also has this capability: ) |
hi @hwdef @Monokaix thanks for all your suggestions, I have updated the feature using new annotation, so it won't break current behaviors. Please have a look. |
|
||
for _, reference := range pod.OwnerReferences { | ||
if reference.Kind != "" && reference.Name != "" { | ||
tmp := pg.getAnnotationsFromUpperRes(reference.Kind, reference.Name, pod.Namespace) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Usually we write annotation on the deployment to specify minMember, but here the ownerReferences of the Pod can only be traced back to the replicaset.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From k8s source code https://github.com/kubernetes/kubernetes/blob/master/pkg/controller/deployment/util/deployment_util.go#L235, the ReplicaSet will inherit annotations from Deployment, so I think it's enough to get annotations from RS
for _, reference := range pod.OwnerReferences { | ||
if reference.Kind != "" && reference.Name != "" { | ||
tmp := pg.getAnnotationsFromUpperRes(reference.Kind, reference.Name, pod.Namespace) | ||
if err := mergo.Merge(&annotations, &tmp); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have scenarios that Pod has multiple OwnerReferences? If there are trully multiple annotations, we won't specify twice the minMember annotations, do we need to use the merge.Merge?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normally there won't be multiple OwnerReferences with this annotation, updated. Thanks
@@ -177,6 +181,37 @@ func (pg *pgcontroller) getAnnotationsFromUpperRes(kind string, name string, nam | |||
} | |||
} | |||
|
|||
func (pg *pgcontroller) getMinMemberFromUpperRes(pod *v1.Pod) *int32 { | |||
defaultMinMember := int32(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This variable name is very strange. You can set 1 as a constant (as defaultMinMember), and then the variable here is initialized to defaultMinMember. minMember
here is ok
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
@@ -193,7 +228,7 @@ func (pg *pgcontroller) inheritUpperAnnotations(pod *v1.Pod, obj *scheduling.Pod | |||
} | |||
} | |||
|
|||
func (pg *pgcontroller) createNormalPodPGIfNotExist(pod *v1.Pod) error { | |||
func (pg *pgcontroller) createNormalPodPGIfNotExist(pod *v1.Pod, minAvailable *int32) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are the variable names here not unified as minMember?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated
@@ -203,6 +238,11 @@ func (pg *pgcontroller) createNormalPodPGIfNotExist(pod *v1.Pod) error { | |||
return err | |||
} | |||
|
|||
minMember := int32(1) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need to initialize again here, you already have the initialization value in getMinMemberFromUpperRes
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated
@@ -327,10 +327,10 @@ func (p TasksPriority) CalcFirstCountResources(count int32) v1.ResourceList { | |||
|
|||
for _, task := range p { | |||
if count <= task.Replicas { | |||
minReq = quotav1.Add(minReq, calTaskRequests(&v1.Pod{Spec: task.Template.Spec}, count)) | |||
minReq = quotav1.Add(minReq, *util.CalTaskRequests(&v1.Pod{Spec: task.Template.Spec}, count)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The return value of the function here is a pointer, and then you use *
directly to get the value pointed to by the pointer, the way it's written looks a bit strange and doesn't quite fit in with go's clean code. The return value of the func CalTaskRequests
does not need to be refactored, keeping the original is better. In fact, when the API was designed, minResources should not be a pointer, ResourceList is a map.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @JesseStutler Thanks for your reviewing! Have updated, please have a look |
API related pr is merged, you can move on: ) |
4241911
to
05dd3e0
Compare
Updated go.mod, please have a look @Monokaix @hwdef @JesseStutler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
/ok-to-test |
"k8s.io/kubernetes/pkg/apis/core/v1/helper" | ||
quotacore "k8s.io/kubernetes/pkg/quota/v1/evaluator/core" | ||
"k8s.io/utils/clock" | ||
) | ||
|
||
func GetPodQuotaUsage(pod *v1.Pod) *v1.ResourceList { | ||
func GetPodQuotaUsage(pod *v1.Pod) v1.ResourceList { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why changed to non-poiner?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just follow this comment from @JesseStutler for better style #4000 (comment)
@@ -172,7 +172,8 @@ func (pg *pgcontroller) processNextReq() bool { | |||
|
|||
// normal pod use volcano | |||
klog.V(4).Infof("Try to create podgroup for pod %s/%s", pod.Namespace, pod.Name) | |||
if err := pg.createNormalPodPGIfNotExist(pod); err != nil { | |||
minMember := pg.getMinMemberFromUpperRes(pod) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic should put in createNormalPodPGIfNotExist
, because when the podgroup is already existed, there is no need to calculate the min member, it will request APIServer which is a heavy operation.
And there is also a qustion mentioned here #3970 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
very good point 👍 so it's better to keep rs/sts etc informers in pgcontroller and read cache from these informers, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, DaemonSet and Job should also be considered.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this should also be considered #3970 (comment)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also this should also be considered #3970 (comment)
I think it's not necessary to check pod annotation? The annotation is designed for workload, it will be strange to see this in pod
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
2495285
to
27e95ed
Compare
Refactored code using informers, please have a look again @hwdef @Monokaix @JesseStutler |
Hi, please make the CI happy |
7c46da3
to
2922e6d
Compare
Signed-off-by: sceneryback <[email protected]>
ad05f15
to
be21116
Compare
hi @hwdef fixed the failed e2e tests, please have a look again |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
Thanks!
@@ -46,7 +46,7 @@ require ( | |||
sigs.k8s.io/controller-runtime v0.13.0 | |||
sigs.k8s.io/yaml v1.4.0 | |||
stathat.com/c/consistent v1.0.0 | |||
volcano.sh/apis v1.10.0-alpha.0.0.20241210014034-bf27f4e986d0 | |||
volcano.sh/apis v1.11.1-0.20250211082520-7f8222e881d9 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 1.11.1?
@@ -99,15 +118,29 @@ func (pg *pgcontroller) Initialize(opt *framework.ControllerOption) error { | |||
AddFunc: pg.addPod, | |||
}) | |||
|
|||
pg.rsInformer = pg.informerFactory.Apps().V1().ReplicaSets() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be under statement if utilfeature.DefaultFeatureGate.Enabled(features.WorkLoadSupport) {
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rsLister is also used in 'inheritUpperAnnotations()' when WorkLoadSupport is not enabled, so only add these codes under the statement:
pg.rsInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
AddFunc: pg.addReplicaSet,
UpdateFunc: pg.updateReplicaSet,
})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WorkLoadSupport feature means whether Volcano supports deployment/rs/ds/sts,etc., so the informers should be initiated when featuregate is not true.
@@ -99,15 +118,29 @@ func (pg *pgcontroller) Initialize(opt *framework.ControllerOption) error { | |||
AddFunc: pg.addPod, | |||
}) | |||
|
|||
pg.rsInformer = pg.informerFactory.Apps().V1().ReplicaSets() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should add a param feature switch to users whether enable non-vcjob supports minMember, becasue it's a little heavy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @Monokaix you mean a feature gate like features.NonVcjobMinMemberSupport which only applies logic in this PR when enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
a param flag is ok.
klog.Infof("Failed to convert minMemberAnnotation of Pod owners <%s/%s> into number, skipping", | ||
pod.Namespace, pod.Name) | ||
return minMember | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
< 0 is ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should be positive, will fix
if minMemberAnno, ok := annotations[scheduling.VolcanoGroupMinMemberAnnotationKey]; ok { | ||
minMemberFromAnno, err := strconv.Atoi(minMemberAnno) | ||
if err != nil { | ||
klog.Infof("Failed to convert minMemberAnnotation of Pod owners <%s/%s> into number, skipping", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This klog info is not so clear, we should mention that remain minmember =1, or users will don't know the value finally set.
return minMember | ||
} | ||
if minMemberFromAnno < math.MinInt32 || minMemberFromAnno > math.MaxInt32 { | ||
klog.Infof("minMember %d exceeds bounds of int32, skipping", minMember) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same above.
@@ -99,15 +118,29 @@ func (pg *pgcontroller) Initialize(opt *framework.ControllerOption) error { | |||
AddFunc: pg.addPod, | |||
}) | |||
|
|||
pg.rsInformer = pg.informerFactory.Apps().V1().ReplicaSets() | |||
pg.rsLister = pg.rsInformer.Lister() | |||
pg.rsSynced = pg.rsInformer.Informer().HasSynced |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is no need to add these rsSynced funcs, these codes:
for informerType, ok := range pg.informerFactory.WaitForCacheSync(stopCh) {
if !ok {
klog.Errorf("caches failed to sync: %v", informerType)
}
}
for informerType, ok := range pg.vcInformerFactory.WaitForCacheSync(stopCh) {
if !ok {
klog.Errorf("caches failed to sync: %v", informerType)
return
}
}
has already done this
@@ -74,13 +74,13 @@ rules: | |||
verbs: ["get", "create", "delete"] | |||
- apiGroups: ["apps"] | |||
resources: ["daemonsets", "statefulsets"] | |||
verbs: ["get"] | |||
verbs: ["get", "list"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we change all these APIs to list/watch them, I think the verbs here should be ["list","watch"]?
Better to post a screenshot to verify your codes work in actual environment thanks |
New changes are detected. LGTM label has been removed. |
When creating pod using replicaSet, the minMember of podgroup created is always 1, this is not reasonable. Consider this case:
As can be seen from this picture:

In the case that pod has replicas from its owners, the podgroup should consider relicas, just as training-operator does, which creates pod group based on resources from all replicas