how to restore cortex operator normally when too many jobs are requested

hello.

I'm currently using cortex 0.40.0. 

I seldom request thousands of jobs to certain cortex api by mistake.
When I do like that, I can't use cortex cli well (the response time is so long, or just hanging) and I guess that cortex operator is overloaded because of me.
(the status of `operator-controller-manager` pod is continuously goes to OOMKilled -> CrashLoopBackOff)

To resolve this issue, I attempted these so far but It didn't work well. 
1. delete thousands of AWS sqs queue
2. delete all of enqueuer job and worker job created by mistake
3. delete certain cortex api and re-deploy it

After all I just down the cluster and up (+ re-deploy all of api) to make cortex work well.
If this is happened, what should I do to restore cortex without down and up cluster?

I glad to your support. Thank you so much.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

how to restore cortex operator normally when too many jobs are requested #2394

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

how to restore cortex operator normally when too many jobs are requested #2394

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions