feat: Support customizable deployment strategy for RawDeployment mode. Fixes #3452 #3603

terrytangyuan · 2024-04-15T15:10:03Z

Fixes #3452. Supersedes #3479

spolti · 2024-04-15T15:25:20Z

Maybe we could migrate the formatting changes to another PR?

pkg/apis/serving/v1beta1/component.go

terrytangyuan · 2024-04-15T15:29:23Z

> Maybe we could migrate the formatting changes to another PR?

The changes that get reformatted are introduced in this PR

pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go

spolti · 2024-04-15T15:34:37Z

> > Maybe we could migrate the formatting changes to another PR?
>
> The changes that get reformatted are introduced in this PR

Even the Jupyter notebook ones?

terrytangyuan · 2024-04-15T16:10:00Z

> > > Maybe we could migrate the formatting changes to another PR?
> >
> > The changes that get reformatted are introduced in this PR
>
> even the jupyter notebook ones?

Oh, I didn't notice those. I'll remove them.

terrytangyuan · 2024-04-15T17:51:49Z

@spolti Actually these notebook changes will have to be part of this PR to pass the CI build.

pkg/controller/v1beta1/inferenceservice/reconcilers/deployment/deployment_reconciler.go

spolti · 2024-04-16T17:57:55Z

/lgtm

spolti · 2024-04-22T15:15:31Z

/lgtm

yuzisun · 2024-04-30T23:21:17Z

pkg/apis/serving/v1beta1/component.go

+
+ // The deployment strategy to use to replace existing pods with new ones.
+ // +optional
+ DeploymentStrategy *appsv1.DeploymentStrategy `json:"deploymentStrategy,omitempty"`


What happens when this field is set for the Knative mode?

This is only used in NewDeploymentReconciler.createRawDeployment, so it won't affect Knative mode.

The field is still exposed on the isvc, so users can set it in Knative mode, but I guess Knative probably validates this.

I think we need to add validation that rejects this field for Knative; otherwise users will expect it to work, while Knative performs a blue/green deployment.
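A minimal sketch of the behavior described above, assuming the raw-deployment path simply copies the optional field onto the Deployment it builds. The types below are simplified, hypothetical stand-ins for `appsv1.DeploymentStrategy` and the component spec, and the 25%/25% default mirrors the Kubernetes rolling-update default rather than anything confirmed in this PR.

```go
package main

import "fmt"

// DeploymentStrategy is a simplified, hypothetical stand-in for
// appsv1.DeploymentStrategy (k8s.io/api/apps/v1).
type DeploymentStrategy struct {
	Type           string // "RollingUpdate" or "Recreate"
	MaxUnavailable string
	MaxSurge       string
}

// ComponentExtensionSpec is a stand-in that holds only the new optional field.
type ComponentExtensionSpec struct {
	DeploymentStrategy *DeploymentStrategy
}

// DeploymentSpec is a stand-in for the Deployment spec built in raw mode.
type DeploymentSpec struct {
	Strategy DeploymentStrategy
}

// defaultStrategy is an assumed default that mirrors the Kubernetes
// rolling-update default; the controller's real default may differ.
func defaultStrategy() DeploymentStrategy {
	return DeploymentStrategy{Type: "RollingUpdate", MaxUnavailable: "25%", MaxSurge: "25%"}
}

// buildRawDeploymentSpec uses the user-supplied strategy when it is set and
// falls back to the default otherwise. Only the raw-deployment path calls it,
// which is why the field has no effect in Knative mode.
func buildRawDeploymentSpec(ext ComponentExtensionSpec) DeploymentSpec {
	spec := DeploymentSpec{Strategy: defaultStrategy()}
	if ext.DeploymentStrategy != nil {
		spec.Strategy = *ext.DeploymentStrategy
	}
	return spec
}

func main() {
	recreate := &DeploymentStrategy{Type: "Recreate"}
	fmt.Println(buildRawDeploymentSpec(ComponentExtensionSpec{}).Strategy.Type)
	fmt.Println(buildRawDeploymentSpec(ComponentExtensionSpec{DeploymentStrategy: recreate}).Strategy.Type)
}
```

Keeping the field a pointer lets the builder distinguish "not set" from an explicitly empty strategy, which matches the `+optional` / `omitempty` markers on the API field.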
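The validation requested here could look roughly like the sketch below. It is not the code merged in this PR: the `InferenceService` struct and `validateDeploymentStrategy` function are hypothetical, and the check assumes the deployment mode is read only from the `serving.kserve.io/deploymentMode` annotation, ignoring any cluster-wide default mode.

```go
package main

import (
	"errors"
	"fmt"
)

const (
	// Annotation KServe uses to select the deployment mode per InferenceService.
	deploymentModeAnnotation = "serving.kserve.io/deploymentMode"
	rawDeploymentMode        = "RawDeployment"
)

// InferenceService is a hypothetical, heavily reduced stand-in for the real type.
type InferenceService struct {
	Annotations           map[string]string
	HasDeploymentStrategy bool // true when spec.*.deploymentStrategy is set
}

var errStrategyUnsupported = errors.New("deploymentStrategy is only supported in RawDeployment mode")

// validateDeploymentStrategy rejects the field outside RawDeployment mode so
// that users do not expect a custom rollout where Knative controls the rollout.
func validateDeploymentStrategy(isvc InferenceService) error {
	if !isvc.HasDeploymentStrategy {
		return nil
	}
	// Reading from a nil map is safe in Go and yields the empty string.
	if isvc.Annotations[deploymentModeAnnotation] != rawDeploymentMode {
		return errStrategyUnsupported
	}
	return nil
}

func main() {
	knative := InferenceService{HasDeploymentStrategy: true}
	raw := InferenceService{
		Annotations:           map[string]string{deploymentModeAnnotation: rawDeploymentMode},
		HasDeploymentStrategy: true,
	}
	fmt.Println(validateDeploymentStrategy(knative))
	fmt.Println(validateDeploymentStrategy(raw))
}
```

Failing at admission time, rather than silently ignoring the field, surfaces the mismatch to the user before any rollout happens.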

pkg/apis/serving/v1beta1/component.go

docs/samples/client/kfserving_sdk_v1beta1_sample.ipynb

bmopuri · 2024-05-05T03:26:28Z

@terrytangyuan Thank you for making these changes. I see that this deployment strategy targets inference services in raw deployment mode. Starting with kserve v12, we support raw deployment for inference graphs as well.

Could you please extend this change to inference graph raw deployment as well?

terrytangyuan · 2024-05-07T00:51:38Z

> Could you please extend this change to inference graph raw deployment as well?

I am trying to narrow down the scope of this PR. Extending this to IG would be a separate PR.

yuzisun · 2024-05-09T13:22:56Z

/lgtm
/approve

oss-prow-bot · 2024-05-09T13:23:05Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: spolti, terrytangyuan, yuzisun

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [yuzisun]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

…Fixes kserve#3452 (kserve#3603)

* feat: Support customizable deployment strategy for RawDeployment mode
  Signed-off-by: Yuan Tang <[email protected]>
* regen
  Signed-off-by: Yuan Tang <[email protected]>
* lint
  Signed-off-by: Yuan Tang <[email protected]>
* Correctly apply rollingupdate
  Signed-off-by: Yuan Tang <[email protected]>
* address comments
  Signed-off-by: Yuan Tang <[email protected]>
* Add validation
  Signed-off-by: Yuan Tang <[email protected]>

---------
Signed-off-by: Yuan Tang <[email protected]>

[RHOAIENG-3375][Cherry-pick] feat: Support customizable deployment strategy for RawDeployment mode. Fixes kserve#3452 (kserve#3603)

…Fixes kserve#3452 (kserve#3603)

* feat: Support customizable deployment strategy for RawDeployment mode
  Signed-off-by: Yuan Tang <[email protected]>
* regen
  Signed-off-by: Yuan Tang <[email protected]>
* lint
  Signed-off-by: Yuan Tang <[email protected]>
* Correctly apply rollingupdate
  Signed-off-by: Yuan Tang <[email protected]>
* address comments
  Signed-off-by: Yuan Tang <[email protected]>
* Add validation
  Signed-off-by: Yuan Tang <[email protected]>

---------
Signed-off-by: Yuan Tang <[email protected]>
Signed-off-by: asd981256 <[email protected]>

* upgrade vllm/transformers version (#3671) upgrade vllm version Signed-off-by: Johnu George <[email protected]> * Add openai models endpoint (#3666) Signed-off-by: Curtis Maddalozzo <[email protected]> * feat: Support customizable deployment strategy for RawDeployment mode. Fixes #3452 (#3603) * feat: Support customizable deployment strategy for RawDeployment mode Signed-off-by: Yuan Tang <[email protected]> * regen Signed-off-by: Yuan Tang <[email protected]> * lint Signed-off-by: Yuan Tang <[email protected]> * Correctly apply rollingupdate Signed-off-by: Yuan Tang <[email protected]> * address comments Signed-off-by: Yuan Tang <[email protected]> * Add validation Signed-off-by: Yuan Tang <[email protected]> --------- Signed-off-by: Yuan Tang <[email protected]> * Enable dtype support for huggingface server (#3613) * Enable dtype for huggingface server Signed-off-by: Dattu Sharma <[email protected]> * Set float16 as default. Fixup linter Signed-off-by: Dattu Sharma <[email protected]> * Add small comment to make the changes understandable Signed-off-by: Dattu Sharma <[email protected]> * Fixup linter Signed-off-by: Dattu Sharma <[email protected]> * Adapt to new huggingfacemodel Signed-off-by: Dattu Sharma <[email protected]> * Fixup merge :) Signed-off-by: Dattu Sharma <[email protected]> * Explicitly mention the behaviour of dtype flag on auto. Signed-off-by: Dattu Sharma <[email protected]> * Default to FP32 for encoder models Signed-off-by: Dattu Sharma <[email protected]> * Selectively add --dtype to parser. 
Use FP16 for GPU and FP32 for CPU Signed-off-by: Dattu Sharma <[email protected]> * Fixup linter Signed-off-by: Dattu Sharma <[email protected]> * Update poetry Signed-off-by: Dattu Sharma <[email protected]> * Use torch.float32 forr tests explicitly Signed-off-by: Dattu Sharma <[email protected]> --------- Signed-off-by: Dattu Sharma <[email protected]> * Add method for checking model health/readiness (#3673) Signed-off-by: Curtis Maddalozzo <[email protected]> * fix for extract zip from gcs (#3510) * fix for extract zip from gcs Signed-off-by: Andrews Arokiam <[email protected]> * initial commit for gcs model download unittests Signed-off-by: Andrews Arokiam <[email protected]> * unittests for model download from gcs Signed-off-by: Andrews Arokiam <[email protected]> * black format fix Signed-off-by: Andrews Arokiam <[email protected]> * code verification Signed-off-by: Andrews Arokiam <[email protected]> --------- Signed-off-by: Andrews Arokiam <[email protected]> * Update Dockerfile and Readme (#3676) Signed-off-by: Gavrish Prabhu <[email protected]> * Update huggingface readme (#3678) * update wording for huggingface README small update to make readme easier to understand Signed-off-by: Alexa Griffith <[email protected]> * Update README.md Signed-off-by: Alexa Griffith [email protected] * Update python/huggingfaceserver/README.md Co-authored-by: Filippe Spolti <[email protected]> Signed-off-by: Alexa Griffith <[email protected]> * update vllm Signed-off-by: alexagriffith <[email protected]> * Update README.md --------- Signed-off-by: Alexa Griffith <[email protected]> Signed-off-by: Alexa Griffith [email protected] Signed-off-by: alexagriffith <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Filippe Spolti <[email protected]> Co-authored-by: Dan Sun <[email protected]> * fix: HPA equality check should include annotations (#3650) * fix: HPA equality check should include annotations Signed-off-by: Yuan Tang <[email protected]> * 
Only watch related autoscalerclass annotation Signed-off-by: Yuan Tang <[email protected]> * simplify Signed-off-by: Yuan Tang <[email protected]> * Add missing delete action Signed-off-by: Yuan Tang <[email protected]> * fix logic Signed-off-by: Yuan Tang <[email protected]> --------- Signed-off-by: Yuan Tang <[email protected]> * Fix: huggingface runtime in helm chart (#3679) fix huggingface runtime in chart Signed-off-by: Dan Sun <[email protected]> * Fix: model id and model dir check order (#3680) * fix huggingface runtime in chart Signed-off-by: Dan Sun <[email protected]> * Allow model_dir to be specified on template Signed-off-by: Dan Sun <[email protected]> * Default model_dir to /mnt/models for HF Signed-off-by: Dan Sun <[email protected]> * Lint format Signed-off-by: Dan Sun <[email protected]> --------- Signed-off-by: Dan Sun <[email protected]> * Fix:vLLM Model Supported check throwing circular dependency (#3688) * Fix:vLLM Model Supported check throwing circular dependency Signed-off-by: Gavrish Prabhu <[email protected]> * remove unwanted comments Signed-off-by: Gavrish Prabhu <[email protected]> * remove unwanted comments Signed-off-by: Gavrish Prabhu <[email protected]> * fix return case Signed-off-by: Gavrish Prabhu <[email protected]> * fix to check all arch in model config forr vllm support Signed-off-by: Gavrish Prabhu <[email protected]> * fixlint Signed-off-by: Gavrish Prabhu <[email protected]> --------- Signed-off-by: Gavrish Prabhu <[email protected]> * Fix: Allow null in Finish reason streaming response in vLLM (#3684) Fix: allow null in Finish reason Signed-off-by: Gavrish Prabhu <[email protected]> --------- Signed-off-by: Johnu George <[email protected]> Signed-off-by: Curtis Maddalozzo <[email protected]> Signed-off-by: Yuan Tang <[email protected]> Signed-off-by: Dattu Sharma <[email protected]> Signed-off-by: Andrews Arokiam <[email protected]> Signed-off-by: Gavrish Prabhu <[email protected]> Signed-off-by: Alexa Griffith <[email 
protected]> Signed-off-by: Alexa Griffith [email protected] Signed-off-by: alexagriffith <[email protected]> Signed-off-by: Dan Sun <[email protected]> Co-authored-by: Curtis Maddalozzo <[email protected]> Co-authored-by: Yuan Tang <[email protected]> Co-authored-by: Datta Nimmaturi <[email protected]> Co-authored-by: Andrews Arokiam <[email protected]> Co-authored-by: Gavrish Prabhu <[email protected]> Co-authored-by: Alexa Griffith <[email protected]> Co-authored-by: Filippe Spolti <[email protected]> Co-authored-by: Dan Sun <[email protected]>

oss-prow-bot bot requested review from alexagriffith and rachitchauhan43 April 15, 2024 15:10

terrytangyuan mentioned this pull request Apr 15, 2024

feat: Support customizable deployment spec for RawDeployment mode #3479

Closed

spolti reviewed Apr 15, 2024

pkg/apis/serving/v1beta1/component.go

spolti reviewed Apr 15, 2024

pkg/controller/v1beta1/inferenceservice/rawkube_controller_test.go

spolti approved these changes Apr 16, 2024

Jooho reviewed Apr 16, 2024

pkg/controller/v1beta1/inferenceservice/reconcilers/deployment/deployment_reconciler.go

oss-prow-bot bot assigned spolti Apr 16, 2024

oss-prow-bot bot added the lgtm label Apr 16, 2024

terrytangyuan added 4 commits April 22, 2024 11:06

feat: Support customizable deployment strategy for RawDeployment mode

6bae881

Signed-off-by: Yuan Tang <[email protected]>

regen

41d85f0

Signed-off-by: Yuan Tang <[email protected]>

lint

bbd566c

Signed-off-by: Yuan Tang <[email protected]>

Correctly apply rollingupdate

ef42393

Signed-off-by: Yuan Tang <[email protected]>

terrytangyuan force-pushed the customize-deploy-strategy branch from 6171f02 to ef42393 Compare April 22, 2024 15:07

oss-prow-bot bot removed the lgtm label Apr 22, 2024

oss-prow-bot bot added the lgtm label Apr 22, 2024

Empty-Commit

fb272c7

Signed-off-by: Yuan Tang <[email protected]>

oss-prow-bot bot removed the lgtm label Apr 22, 2024

terrytangyuan added the lgtm label Apr 22, 2024

yuzisun reviewed Apr 30, 2024

pkg/apis/serving/v1beta1/component.go (outdated)

address comments

b9813c2

Signed-off-by: Yuan Tang <[email protected]>

oss-prow-bot bot removed the lgtm label May 3, 2024

yuzisun reviewed May 3, 2024

docs/samples/client/kfserving_sdk_v1beta1_sample.ipynb

terrytangyuan added 2 commits May 2, 2024 23:36

Empty-Commit

46dcd72

Signed-off-by: Yuan Tang <[email protected]>

Empty-Commit

fc440b5

Signed-off-by: Yuan Tang <[email protected]>

Add validation

62845a1

Signed-off-by: Yuan Tang <[email protected]>

Empty-Commit

e771199

Signed-off-by: Yuan Tang <[email protected]>

oss-prow-bot bot assigned yuzisun May 9, 2024

oss-prow-bot bot added the lgtm label May 9, 2024

oss-prow-bot bot added the approved label May 9, 2024

yuzisun merged commit 629e4ae into kserve:master May 9, 2024
57 of 58 checks passed

terrytangyuan deleted the customize-deploy-strategy branch May 9, 2024 14:42
