[rush] add support for sharding phases #4652

Open
wants to merge 53 commits into
base: main

Conversation

aramissennyeydd
Contributor

@aramissennyeydd aramissennyeydd commented Apr 16, 2024

Summary

This PR adds support for sharding to Rush phases. This allows plugins that support sharding, like Jest, to be split into multiple shards that run independently. It does this by adding a new set of options to rush-project.json under a new sharding key. For example:

{
    "operationSettings": [
        {
            "operationName": "_phase:test",
            "outputFolderNames": ["coverage", "temp/coverage"],
            "sharding": {
                "count": 6,
                // Defaults to `--shard={shardIndex}/{shardCount}`
                "shardArgumentFormat": "--shard-format={shardIndex}-{shardCount}"
            }
        }
    ]
}
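To make the placeholder substitution concrete, here is a minimal sketch of how a shardArgumentFormat template could be expanded into a CLI argument. The helper name formatShardArgument and its validation rules are assumptions for illustration and are not taken from the Rush source; the 1-based shard index follows the example output later in this thread.

```typescript
// Hypothetical helper, not Rush's actual implementation.
const DEFAULT_SHARD_ARGUMENT_FORMAT: string = "--shard={shardIndex}/{shardCount}";

function formatShardArgument(format: string, shardIndex: number, shardCount: number): string {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError(`shardCount must be a positive integer, got ${shardCount}`);
  }
  // Shards are assumed to be numbered 1..shardCount.
  if (!Number.isInteger(shardIndex) || shardIndex < 1 || shardIndex > shardCount) {
    throw new RangeError(`shardIndex must be between 1 and ${shardCount}, got ${shardIndex}`);
  }
  // Replace every occurrence of each placeholder with its value.
  return format
    .replace(/\{shardIndex\}/g, String(shardIndex))
    .replace(/\{shardCount\}/g, String(shardCount));
}

console.log(formatShardArgument(DEFAULT_SHARD_ARGUMENT_FORMAT, 1, 6)); // --shard=1/6
console.log(formatShardArgument("--shard-format={shardIndex}-{shardCount}", 3, 6)); // --shard-format=3-6
```

The default template matches the argument shape that Jest accepts for its own --shard option, which is why a custom format is only needed for tools with a different convention.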

Details

This is the initial chunk of work to support sharding in the operation graph. It includes both the sharding nodes as well as a collator node that can run a script after all of the shard nodes are complete.
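As a rough illustration of the shape described here, the sketch below expands one sharded phase into N shard nodes plus a single collate node that depends on all of them. The SketchNode type and the expandShardedPhase function are hypothetical and are not Rush's real operation classes; only the node naming (`_phase:${name}:shard` for shards, `_phase:${name}` for the collator) follows this PR's discussion.

```typescript
// Illustrative types only; these are not Rush's real Operation classes.
interface SketchNode {
  name: string;
  shardIndex?: number; // 1-based, only set on shard nodes
  dependencies: SketchNode[];
}

function expandShardedPhase(
  phaseName: string,
  shardCount: number
): { shards: SketchNode[]; collate: SketchNode } {
  if (!Number.isInteger(shardCount) || shardCount < 1) {
    throw new RangeError(`shardCount must be a positive integer, got ${shardCount}`);
  }
  const shards: SketchNode[] = [];
  for (let i = 1; i <= shardCount; i++) {
    // Shard nodes have no dependencies on each other, so they can run in parallel.
    shards.push({ name: `${phaseName}:shard`, shardIndex: i, dependencies: [] });
  }
  // The collate node keeps the phase's own name and waits for every shard.
  const collate: SketchNode = { name: phaseName, dependencies: shards };
  return { shards, collate };
}

const { shards, collate } = expandShardedPhase("_phase:test", 6);
console.log(shards.length); // 6
console.log(collate.dependencies.every((d) => d.name === "_phase:test:shard")); // true
```

Because the shards are independent nodes in the graph, a scheduler (or a cobuild runner) can pick them up separately, which is the property this PR is after.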

I originally attempted this with Heft plugins. However, tying into Rush parallelism and cobuilds is one of our end goals with this work, and Heft plugins don't allow that at the moment.

How it was tested

I've been testing locally with node apps/rush/lib/start-dev test --to heft-jest-shards-test -p 6 and varying the parallelism flag. There are 6 test files in heft-jest-shards-test, and each one runs for 10 seconds and then passes, so -p 6 should finish in ~10 seconds, -p 3 in ~20 seconds, and so on. I also added a sharded-repo to the existing cobuild suite to ensure this works with cobuilds.
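The expected timings follow from simple arithmetic. The sketch below assumes one test file per shard, equal run times, and no scheduling or startup overhead; the function name expectedWallSeconds is made up for this illustration.

```typescript
// Idealized estimate: shards run in "waves" of at most `parallelism` at a time.
function expectedWallSeconds(shardCount: number, parallelism: number, secondsPerShard: number): number {
  if (shardCount < 1 || parallelism < 1) {
    throw new RangeError("shardCount and parallelism must be at least 1");
  }
  const waves: number = Math.ceil(shardCount / parallelism);
  return waves * secondsPerShard;
}

console.log(expectedWallSeconds(6, 6, 10)); // 10
console.log(expectedWallSeconds(6, 3, 10)); // 20
console.log(expectedWallSeconds(6, 2, 10)); // 30
```

Real runs will be somewhat slower than these figures because of process startup and the collate step.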

TODO:

  • retest cobuilds after the log filename is updated.

Impacted documentation

@aramissennyeydd
Contributor Author

Also, I think CI is failing due to mismatched versions of rush - the sharding option isn't available in the version of install-run-rush.

@aramissennyeydd aramissennyeydd marked this pull request as ready for review April 29, 2024 14:50
@aramissennyeydd
Contributor Author

@dmichon-msft I trimmed out the heft changes - this should be good for another 👀

@octogonz
Collaborator

@iclanton @dmichon-msft Are we ready to merge this?

@octogonz octogonz changed the title [rush-lib]: add support for sharding phases [rush-lib] add support for sharding phases May 14, 2024
@octogonz octogonz changed the title [rush-lib] add support for sharding phases [rush] add support for sharding phases May 14, 2024
Contributor

@dmichon-msft dmichon-msft left a comment


Couple nits for performance, but otherwise looks about good to go.

common/reviews/api/rush-lib.api.md Outdated Show resolved Hide resolved
@aramissennyeydd
Contributor Author

I've left unresolved the conversations above where I still have open questions.

Current state of things:

  1. Shards are spliced into the current operation graph, with a pre-shard node that does nothing, a set of N shard nodes that do the sharded work (_phase:${name}:shard), and a single collate node that runs work over the combined shard output (_phase:${name}).
  2. Both shard and collate operations use the overall phase's missingScriptBehavior configuration.
  3. Operation weighting works with sharding as well. In the timeline below, b shards have weight 10 and a shards have weight 4, and parallelism is set to 10. As expected, only 1 b build is picked up at a time, and 3 a builds are picked up at once.
b (build) - shard 15/15 ###----------------------------------------------------------------------------- 2.1s
b (build) - shard 14/15 --####-------------------------------------------------------------------------- 2.1s
b (build) - shard 13/15 -----###------------------------------------------------------------------------ 2.1s
b (build) - shard 12/15 -------####--------------------------------------------------------------------- 2.1s
b (build) - shard 11/15 ----------####------------------------------------------------------------------ 2.1s
b (build) - shard 10/15 -------------###---------------------------------------------------------------- 2.1s
 b (build) - shard 9/15 ---------------####------------------------------------------------------------- 2.2s
 b (build) - shard 8/15 ------------------####---------------------------------------------------------- 2.2s
 b (build) - shard 7/15 ---------------------###-------------------------------------------------------- 2.2s
 b (build) - shard 6/15 -----------------------####----------------------------------------------------- 2.2s
 b (build) - shard 5/15 --------------------------####-------------------------------------------------- 2.2s
 b (build) - shard 4/15 -----------------------------###------------------------------------------------ 2.2s
 b (build) - shard 3/15 -------------------------------####--------------------------------------------- 2.2s
 b (build) - shard 2/15 ----------------------------------####------------------------------------------ 2.2s
 b (build) - shard 1/15 -------------------------------------###---------------------------------------- 2.1s
    b (build) - collate ---------------------------------------##--------------------------------------- 0.7s
  a (build) - shard 3/3 ---------------------------------------####------------------------------------- 2.2s
  a (build) - shard 2/3 ---------------------------------------####------------------------------------- 2.2s
  a (build) - shard 1/3 ---------------------------------------####------------------------------------- 2.1s
  4. I adjusted the collate script to also use CLI parameters: --shard-parent-folder and --shard-count.
  5. I also verified that the new build and collate scripts work as expected. Example collate output:
Hello world! b --shard=1/15 --output-directory=.rush/operations/_phase_build/shards/1
Hello world! b --shard=2/15 --output-directory=.rush/operations/_phase_build/shards/2
Hello world! b --shard=3/15 --output-directory=.rush/operations/_phase_build/shards/3
Hello world! b --shard=4/15 --output-directory=.rush/operations/_phase_build/shards/4
Hello world! b --shard=5/15 --output-directory=.rush/operations/_phase_build/shards/5
Hello world! b --shard=6/15 --output-directory=.rush/operations/_phase_build/shards/6
Hello world! b --shard=7/15 --output-directory=.rush/operations/_phase_build/shards/7
Hello world! b --shard=8/15 --output-directory=.rush/operations/_phase_build/shards/8
Hello world! b --shard=9/15 --output-directory=.rush/operations/_phase_build/shards/9
Hello world! b --shard=10/15 --output-directory=.rush/operations/_phase_build/shards/10
Hello world! b --shard=11/15 --output-directory=.rush/operations/_phase_build/shards/11
Hello world! b --shard=12/15 --output-directory=.rush/operations/_phase_build/shards/12
Hello world! b --shard=13/15 --output-directory=.rush/operations/_phase_build/shards/13
Hello world! b --shard=14/15 --output-directory=.rush/operations/_phase_build/shards/14
Hello world! b --shard=15/15 --output-directory=.rush/operations/_phase_build/shards/15
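For readers wondering why weight 4 yields three concurrent a shards under a parallelism of 10 in the timeline above, one scheduling rule consistent with those numbers is: start another operation whenever the total in-progress weight is still below the parallelism limit. The sketch below models that rule. It is an assumption used to explain the observed numbers, not a statement of how Rush's scheduler is implemented, and the function name is hypothetical.

```typescript
// Assumed rule: an operation may start while in-progress weight < parallelism,
// even if starting it pushes the total above the limit.
function maxConcurrentOperations(weight: number, parallelism: number): number {
  if (weight <= 0 || parallelism <= 0) {
    throw new RangeError("weight and parallelism must be positive");
  }
  // Assumption: a single operation never counts for more than the whole limit.
  const effectiveWeight: number = Math.min(weight, parallelism);
  let inProgressWeight: number = 0;
  let started: number = 0;
  while (inProgressWeight < parallelism) {
    inProgressWeight += effectiveWeight;
    started++;
  }
  return started;
}

console.log(maxConcurrentOperations(10, 10)); // 1
console.log(maxConcurrentOperations(4, 10)); // 3
```

Under a stricter rule that never exceeds the limit, weight 4 would allow only two operations at once, so the timeline suggests the more permissive rule is in effect.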

Labels
None yet
Projects
Status: In Progress

4 participants