Introduce batching into worker discovery during scaling #773

Open: wants to merge 4 commits into base: master


Andyz26 (Collaborator) commented Jun 13, 2025

Refactor worker state filtering and the handling of scheduling update gaps during scaling. This reduces scaling update storms from N individual updates to 1-3 batched updates.

  • Filter JobSchedulingInfo to only include Started workers, preventing downstream connection interruptions
  • Add smart refresh batching with pending worker detection to avoid premature flag resets
  • Implement WorkerState.isPendingState() helper for consistent state checking
  • Add comprehensive tests covering scaling scenarios and flag reset edge cases
  • Include detailed context and analysis documentation of connection mechanisms and scaling optimizations
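
For readers without the diff open, the first three bullets can be pictured roughly as follows. This is a minimal sketch, not the code in this PR: the `Worker` and `RefreshBatcher` types, the `onTick` and `onWorkerStateChange` methods, the list of enum constants, and the choice of which states count as pending are all assumptions made for illustration. Only the names `WorkerState`, `isPendingState()`, and the `Started` state come from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Illustrative sketch only: hypothetical stand-ins, not the actual Mantis classes.
final class ScalingBatchSketch {

    // Assumed state set. Which states are "pending" is an assumption of this sketch.
    enum WorkerState {
        Accepted, Launched, StartInitiated, Started, Failed, Completed;

        // Single place to ask "is this worker still on its way to Started?"
        boolean isPendingState() {
            return this == Accepted || this == Launched || this == StartInitiated;
        }
    }

    // Hypothetical worker record: an index plus its current state.
    static final class Worker {
        final int index;
        final WorkerState state;

        Worker(int index, WorkerState state) {
            this.index = index;
            this.state = state;
        }
    }

    // Only Started workers are exposed downstream, so consumers never
    // try to connect to a worker that is not ready yet.
    static List<Worker> startedOnly(List<Worker> workers) {
        return workers.stream()
            .filter(w -> w.state == WorkerState.Started)
            .collect(Collectors.toList());
    }

    // Hypothetical batcher: state changes only set a flag, and a periodic
    // tick publishes one snapshot for all changes seen since the last tick.
    static final class RefreshBatcher {
        private boolean refreshNeeded;

        void onWorkerStateChange() {
            refreshNeeded = true;
        }

        // Returns the snapshot to publish on this tick, or empty if nothing changed.
        Optional<List<Worker>> onTick(List<Worker> workers) {
            if (!refreshNeeded) {
                return Optional.empty();
            }
            boolean anyPending = workers.stream().anyMatch(w -> w.state.isPendingState());
            // Reset the flag only once no worker is pending; resetting earlier
            // could drop the update for workers that start after this tick.
            if (!anyPending) {
                refreshNeeded = false;
            }
            return Optional.of(startedOnly(workers));
        }
    }

    public static void main(String[] args) {
        List<Worker> workers = new ArrayList<>();
        workers.add(new Worker(0, WorkerState.Started));
        workers.add(new Worker(1, WorkerState.Accepted));

        RefreshBatcher batcher = new RefreshBatcher();
        batcher.onWorkerStateChange();
        batcher.onTick(workers)
            .ifPresent(snapshot -> System.out.println("published " + snapshot.size() + " worker(s)"));
    }
}
```

In this sketch a scale-up that adds many workers sets the flag many times but publishes at most one snapshot per tick, and the flag stays set until every worker has left a pending state, so a final snapshot containing all Started workers is always sent.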


github-actions bot commented Jun 13, 2025

Test Results

151 files +1, 151 suites +1, 9m 3s ⏱️ -43s
658 tests +8: 647 ✅ +9, 11 💤 ±0, 0 ❌ -1
658 runs +7: 647 ✅ +8, 11 💤 ±0, 0 ❌ -1

Results for commit bc9b2f5. ± Comparison against base commit 573980e.

♻️ This comment has been updated with latest results.


github-actions bot commented Jun 13, 2025

Uploaded Artifacts

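To use these artifacts in your Gradle project, paste the following lines into your build.gradle. Note that resolutionStrategy is configured per configuration, so the block belongs inside a configuration closure such as configurations.all { ... }.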

resolutionStrategy {
    force "io.mantisrx:mantis-common:0.1.0-20250618.001552-580"
    force "io.mantisrx:mantis-common-serde:0.1.0-20250618.001552-580"
    force "io.mantisrx:mantis-common-akka:0.1.0-20250618.001552-16"
    force "io.mantisrx:mantis-client:0.1.0-20250618.001552-581"
    force "io.mantisrx:mantis-discovery-proto:0.1.0-20250618.001552-580"
    force "io.mantisrx:mantis-network:0.1.0-20250618.001552-580"
    force "io.mantisrx:mantis-runtime:0.1.0-20250618.001552-581"
    force "io.mantisrx:mantis-runtime-executor:0.1.0-20250618.001552-116"
    force "io.mantisrx:mantis-remote-observable:0.1.0-20250618.001552-581"
    force "io.mantisrx:mantis-jm-akka:0.1.0-20250618.001552-10"
    force "io.mantisrx:mantis-shaded:0.1.0-20250618.001552-579"
    force "io.mantisrx:mantis-rxcontrol:0.1.0-20250618.001552-54"
    force "io.mantisrx:mantis-runtime-loader:0.1.0-20250618.001552-581"
    force "io.mantisrx:mantis-runtime-autoscaler-api:0.1.0-20250618.001552-10"
    force "io.mantisrx:mantis-testcontainers:0.1.0-20250618.001552-250"
    force "io.mantisrx:mantis-connector-iceberg:0.1.0-20250618.001552-579"
    force "io.mantisrx:mantis-connector-job-source:0.1.0-20250618.001552-32"
    force "io.mantisrx:mantis-connector-publish:0.1.0-20250618.001552-580"
    force "io.mantisrx:mantis-control-plane-core:0.1.0-20250618.001552-574"
    force "io.mantisrx:mantis-control-plane-dynamodb:0.1.0-20250618.001552-41"
    force "io.mantisrx:mantis-control-plane-server:0.1.0-20250618.001552-574"
    force "io.mantisrx:mantis-examples-core:0.1.0-20250618.001552-574"
    force "io.mantisrx:mantis-control-plane-client:0.1.0-20250618.001552-580"
    force "io.mantisrx:mantis-examples-groupby-sample:0.1.0-20250618.001552-574"
    force "io.mantisrx:mantis-connector-kafka:0.1.0-20250618.001552-581"
    force "io.mantisrx:mantis-examples-mantis-publish-sample:0.1.0-20250618.001552-574"
    force "io.mantisrx:mantis-examples-sine-function:0.1.0-20250618.001552-573"
    force "io.mantisrx:mantis-examples-synthetic-sourcejob:0.1.0-20250618.001552-574"
    force "io.mantisrx:mantis-examples-twitter-sample:0.1.0-20250618.001552-574"
    force "io.mantisrx:mantis-examples-jobconnector-sample:0.1.0-20250618.001552-574"
    force "io.mantisrx:mantis-examples-wordcount:0.1.0-20250618.001552-573"
    force "io.mantisrx:mantis-publish-netty:0.1.0-20250618.001552-573"
    force "io.mantisrx:mantis-publish-core:0.1.0-20250618.001552-573"
    force "io.mantisrx:mantis-publish-netty-guice:0.1.0-20250618.001552-574"
    force "io.mantisrx:mantis-source-job-publish:0.1.0-20250618.001552-574"
    force "io.mantisrx:mantis-server-worker-client:0.1.0-20250618.001552-574"
    force "io.mantisrx:mantis-source-job-kafka:0.1.0-20250618.001552-574"
    force "io.mantisrx:mantis-server-agent:0.1.0-20250618.001552-573"
}

Andyz26 requested a deployment to Integrate Pull Request on June 13, 2025 23:08 with GitHub Actions (Waiting)

liuml07 (Contributor) left a comment

Looks good!

Andyz26 temporarily deployed to Integrate Pull Request on June 13, 2025 23:58 with GitHub Actions (Inactive)
Andyz26 deployed to Integrate Pull Request on June 18, 2025 00:14 with GitHub Actions (Active)
2 participants