Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a way to implement a MultiKueue support for the custom Jobs #2349

Open
4 of 5 tasks
tenzen-y opened this issue Jun 3, 2024 · 32 comments · May be fixed by #2458 or #2405
Open
4 of 5 tasks

Provide a way to implement a MultiKueue support for the custom Jobs #2349

tenzen-y opened this issue Jun 3, 2024 · 32 comments · May be fixed by #2458 or #2405
Assignees
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@tenzen-y
Copy link
Member

tenzen-y commented Jun 3, 2024

This is blocked by

What would you like to be added:
I would like to support the extension mechanism so that we can implement the MultiKueue controller for the custom / in-house Jobs. I'm thinking of exposing the jobAdapter interface and implementing a mechanism to add arbitrary objects to the adapters similar to the jobframework integration manager:

adapters:

adapters = map[string]jobAdapter{
batchv1.SchemeGroupVersion.WithKind("Job").String(): &batchJobAdapter{},
jobset.SchemeGroupVersion.WithKind("JobSet").String(): &jobsetAdapter{},
}

jobAdapters interface:
type jobAdapter interface {
// Creates the Job object in the worker cluster using remote client, if not already created.
// Copy the status from the remote job if already exists.
SyncJob(ctx context.Context, localClient client.Client, remoteClient client.Client, key types.NamespacedName, workloadName, origin string) error
// Deletes the Job in the worker cluster.
DeleteRemoteObject(ctx context.Context, remoteClient client.Client, key types.NamespacedName) error
// KeepAdmissionCheckPending returns true if the state of the multikueue admission check should be
// kept Pending while the job runs in a worker. This might be needed to keep the managers job
// suspended and not start the execution locally.
KeepAdmissionCheckPending() bool
// IsJobManagedByKueue returns:
// - a bool indicating if the job object identified by key is managed by kueue and can be delegated.
// - a reason indicating why the job is not managed by Kueue
// - any API error encountered during the check
IsJobManagedByKueue(ctx context.Context, localClient client.Client, key types.NamespacedName) (bool, string, error)
}

jobframework integration manager: https://github.com/kubernetes-sigs/kueue/blob/e461fe0827786e5cd6f45ff6739ebeed9a700b05/pkg/controller/jobframework/integrationmanager.go

Why is this needed:
In many company, they have company specific CustomResources, and we should give a possibility to manage such CustomResources across multiple clusters.

Completion requirements:

This enhancement requires the following artifacts:

  • Design doc
  • API change
  • Docs update

The artifacts should be linked in subsequent comments.

@tenzen-y tenzen-y added the kind/feature Categorizes issue or PR as related to a new feature. label Jun 3, 2024
@tenzen-y
Copy link
Member Author

tenzen-y commented Jun 3, 2024

cc: @trasc @mimowo @alculquicondor

@alculquicondor
Copy link
Contributor

+1
Is this something you want to give a first try to?

We've had requests to support kubeflow TFJobs

@tenzen-y
Copy link
Member Author

tenzen-y commented Jun 3, 2024

Is this something you want to give a first try to?

I will try this issue after I finish #2175.
So, feel free to assign this issue to anyone.

@dgrove-oss
Copy link
Contributor

We should design this to support external integrations as well.

@tenzen-y
Copy link
Member Author

tenzen-y commented Jun 3, 2024

We should design this to support external integrations as well.

Yes, we can do it. But I think starting things as a minimum required is better. So, I want to take the external integrations as a separate issue.

@trasc
Copy link
Contributor

trasc commented Jun 4, 2024

For other jobs with built-in support in kueue it is fairly easy to do.

For fully external implementation it might be a bit harder, the multikueue admission check controller is composed of 3 object controllers (reconcilers for workload, admissionchecks and multikueue clusters) and some long lived clients with watchers connected to the worker clusters. It's likely that the MulikueueCluster API needs to be modified to ensure that one multikueue cluster is only managed by one admission check controller.

For now I wold suggest to do the builtin jobs support and create a follow-up for the external usage.

@trasc
Copy link
Contributor

trasc commented Jun 4, 2024

/assign

@tenzen-y
Copy link
Member Author

tenzen-y commented Jun 4, 2024

For other jobs with built-in support in kueue it is fairly easy to do.

For fully external implementation it might be a bit harder, the multikueue admission check controller is composed of 3 object controllers (reconcilers for workload, admissionchecks and multikueue clusters) and some long lived clients with watchers connected to the worker clusters. It's likely that the MulikueueCluster API needs to be modified to ensure that one multikueue cluster is only managed by one admission check controller.

For now I wold suggest to do the builtin jobs support and create a follow-up for the external usage.

@trasc This issue motivation is supporting in-house/external jobs. So, if you work only on built-in jobs like TFJob, could you open another issue?

@trasc
Copy link
Contributor

trasc commented Jun 4, 2024

@trasc This issue motivation is supporting in-house/external jobs. So, if you work only on built-in jobs like TFJob, could you open another issue?

Done #2350

@tenzen-y
Copy link
Member Author

tenzen-y commented Jun 4, 2024

@trasc This issue motivation is supporting in-house/external jobs. So, if you work only on built-in jobs like TFJob, could you open another issue?

Done #2350

@trasc Thank you. I'm guessing that we can work on this after #2350 is done because we will implement a mechanism to support more bilt-in Jobs in #2350, right?

@tenzen-y
Copy link
Member Author

tenzen-y commented Jun 4, 2024

I synced supporting MultiKueue for the custom Jobs with @trasc.
In conclusion, we need to have 2 phases to support it in the following:

  1. Move JobAdapters to JobFramework, and then expose the interface to other Go packages.
  2. Implement a mechanism to specify AdmissionCheck ControllerName for the MultiKueue via JobFramwork.

In #2350, @trasc tries the first phase, and then he will try the second phase in this issue.

@mszadkow
Copy link
Contributor

/assign

@alculquicondor
Copy link
Contributor

Implement a mechanism to specify AdmissionCheck ControllerName for the MultiKueue via JobFramwork.

That sounds odd. External controllers shouldn't have to re-implement the admission check. They should only implement how to sync jobs.

If those two things are tightly coupled, they should be decoupled.

@trasc can you actually write a short description of the components involved in MultiKueue and put it in https://github.com/kubernetes-sigs/kueue/tree/main/keps/693-multikueue

@tenzen-y
Copy link
Member Author

@alculquicondor @trasc TBH, as an external contributor (non Googler), I'm feeling that the MultiKueue specifications is not clearing.
Because even though the current MultuKueue is significantly evolving different from the design phase, the KEP is too outdated (actually, we don't have the MultiKueueConfig resource in the implementations level): https://github.com/kubernetes-sigs/kueue/tree/main/keps/693-multikueue#design-details.

So, before we apply the something enhancements into the MultiKueue, shouldn't we update the KEP to align with the actual implementations and behavior?

After that, can we add new enhancements to the MultiKueue?

@trasc
Copy link
Contributor

trasc commented Jun 14, 2024

Hi @tenzen-y , indeed the KEP does not contain all the implementation details, but I don't think it should, and I don't see any significant deviations from the design.

To be specific MultiKueueConfig is defined here:

// MultiKueueConfig is the Schema for the multikueue API
type MultiKueueConfig struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec MultiKueueConfigSpec `json:"spec,omitempty"`
}

and used:

https://github.com/search?q=repo%3Akubernetes-sigs%2Fkueue+%22MultiKueueConfig%22+language%3AGo&type=code

<site>/docs/concepts/multikueue and <site>/docs/tasks/manage/setup_multikueue are also providing additional information about MultiKueue.

@tenzen-y
Copy link
Member Author

To be specific MultiKueueConfig is defined here:

I meant the MultiKueueCluster.

/docs/concepts/multikueue and /docs/tasks/manage/setup_multikueue are also providing additional information about MultiKueue.

That is user documentation, so I still think updating the MultiKueue proposal would be worth it since clarifying the internal mechanism would be worth it. Even if some maintainers step down from this project, this Kueue project continue to be the journey. Actually, I have some experience in the unclear internal specifications brought OSS projects bugs and challenging maintenance.

Additionally, the latest KEP (not outdated) allows us to clear the place where the rabbit hole is and could reduce the risks of generating traps by being reviewed by multiple contributors.

@mimowo
Copy link
Contributor

mimowo commented Jun 14, 2024

Personally, I think this is a distinguished feature which justifies a dedicated KEP (even though I would be acceptive of MultiKueue KEP update). I would like to see the chosen approach outlined, and indicated alternatives, along with pros and cons. Additionally, the KEP process creates space for early feedback from users who are not code reviewers.

My ask is motivated by difficulty understanding design choices taken in the pending PRs (#2373 and #2405). Getting the decisions discussed and agreed early on could avoid the confusion, and potentially dropping a significant amount of code in case we decide to change the approach.

@trasc
Copy link
Contributor

trasc commented Jun 14, 2024

I meant the MultiKueueCluster.

type MultiKueueClusterSpec struct {
// Information how to connect to the cluster.
KubeConfig KubeConfig `json:"kubeConfig"`
}
type MultiKueueClusterStatus struct {
// +optional
// +listType=map
// +listMapKey=type
// +patchStrategy=merge
// +patchMergeKey=type
Conditions []metav1.Condition `json:"conditions,omitempty" patchStrategy:"merge" patchMergeKey:"type"`
}
// +genclient
// +genclient:nonNamespaced
// +kubebuilder:object:root=true
// +kubebuilder:storageversion
// +kubebuilder:subresource:status
// +kubebuilder:resource:scope=Cluster
// MultiKueueCluster is the Schema for the multikueue API
type MultiKueueCluster struct {
metav1.TypeMeta `json:",inline"`
metav1.ObjectMeta `json:"metadata,omitempty"`
Spec MultiKueueClusterSpec `json:"spec,omitempty"`
Status MultiKueueClusterStatus `json:"status,omitempty"`
}

@trasc
Copy link
Contributor

trasc commented Jun 14, 2024

Additionally, the latest KEP (not outdated) allows us to clear the place where the rabbit hole is and could reduce the risks of generating traps by being reviewed by multiple contributors.

The KEP cannot cover all the implementation details, The details in all the KEPs are limited, take the provisioning admission check controller , a very similar component, there is no mentioning of it's internal components.

@alculquicondor
Copy link
Contributor

I synced with @trasc offline.

We agreed that there are quite a few mini-components in MultiKueue that it is a bit hard to keep track of as an outsider. Traian will update the KEP with those design elements, spending about 1 paragraph for each mini-component.

I agree that most features don't require such level of detail, but the complexity in MultiKueue is a bit higher.

@trasc
Copy link
Contributor

trasc commented Jun 18, 2024

#2436 opened, but on hold pending the merge of #2373

@alculquicondor
Copy link
Contributor

#2373 looks like a simple refactoring overall to me. I'm ok merging it and have the documentation match those changes.

@trasc trasc removed their assignment Jun 19, 2024
@tenzen-y
Copy link
Member Author

Additionally, the latest KEP (not outdated) allows us to clear the place where the rabbit hole is and could reduce the risks of generating traps by being reviewed by multiple contributors.

The KEP cannot cover all the implementation details, The details in all the KEPs are limited, take the provisioning admission check controller , a very similar component, there is no mentioning of it's internal components.

Actually, I synced with @trasc regarding to MultiKueue previous KEP, In conclusion, I agree with the current proposal, basically.

@mszadkow mszadkow linked a pull request Jun 20, 2024 that will close this issue
@mimowo
Copy link
Contributor

mimowo commented Jun 20, 2024

Actually, I synced with @trasc regarding to MultiKueue previous KEP, In conclusion, I agree with the current proposal, basically.

I'm sorry in advance if I ask stupid questions, but I still have some difficulty understanding the approach which I see continues to be implemented in #2458.

Specifically:

  • is it going to be a single AdmissionCheck per Job? In that case how a user can configure a single ClusterQueue to run multiple types of built-in jobs?
  • will the new AdmissionChecks also run the built-in jobs, or the admins are expected to configure different admission checks for custom and built-in jobs?
  • do we really need the new zoo of MultiKueue Admission checks?

Can we at least consider a design where we still have a single MultiKueue admission check, but the custom jobs adapters are registered to it. I think it will make life on OnCall easier to have a single MutliKueue admission check that can be quickly identified by its name rather than a zoo of AdmissionChecks. It would also make admin users life easier to have a global configuration for the MK AC plugins, rather than doing this per ClusterQueue.

@trasc
Copy link
Contributor

trasc commented Jun 20, 2024

We are preparing a KEP for that.

@mimowo
Copy link
Contributor

mimowo commented Jun 20, 2024

Ok great!

@tenzen-y
Copy link
Member Author

We are preparing a KEP for that.

That sounds great to me. I will review the submitted design soon.
Thank you.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 26, 2024
@tenzen-y
Copy link
Member Author

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 26, 2024
@tenzen-y
Copy link
Member Author

@mszadkow Is there any progressing on your KEP?

@mimowo
Copy link
Contributor

mimowo commented Sep 26, 2024

We will probably postpone the KEP effort until after MK goes beta

@tenzen-y
Copy link
Member Author

We will probably postpone the KEP effort until after MK goes beta

That makes sense. The KEP could move forward during the MK beta graduation if @mszadkow has time.
But, we can prioritize the MK beta beta graduation rather than this feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/feature Categorizes issue or PR as related to a new feature.
Projects
None yet
8 participants