GHA pipeline rewrite for ease and speed #1551

gpmayorga · 2023-09-15T03:46:58Z

Description

The main goal was to enable an efficient cache system to reduce build times for the chain node and wasm builds. Since every single workflow file needed touching, I started consolidating code and adding other improvements, which resulted in an almost full rewrite of our pipelines.

Ref (internal) document: https://centrifuge.hackmd.io/Ftgot4ilSxK5ERyrEujnBg?view

Changes and Descriptions

Cache features

This PR introduces 3 types of cache systems:

1. Cross-PR and cross-branch cache backed by a GCS bucket through Mozilla's "smart" sccache. Any workflow using Rust can take advantage of it by using actions/prep-ubuntu with some parameters.
   - Used by: sanity checks, benchmarks, linters
2. srtool WASM build cache
   - Uses the same cache system as our current pipelines, with some enhancements (Swatinem/rust-cache). It gets mounted on the srtool container.
   - Used by: wasm runtime builds
3. Docker layer caching with docker buildx
   - Uses the buildx cache system, set up to use the registry as a cache; it can also be set up with the GHA native cache system.
   - Used by: docker build
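
As a rough sketch of the first cache type: sccache reads its GCS backend settings from environment variables, so a Rust job could be wired up roughly as follows. This is not the actual actions/prep-ubuntu implementation. The workflow and job names, the secret name `GCS_SCCACHE_KEY`, and the key file path are assumptions for illustration; the bucket name is the one referenced in this description.

```yaml
# Sketch only: not the real actions/prep-ubuntu implementation.
name: sccache-gcs-sketch            # hypothetical workflow name
on: pull_request
jobs:
  cargo-check:                      # hypothetical job name
    runs-on: ubuntu-latest
    env:
      SCCACHE_GCS_BUCKET: centrifuge-chain-sccache-backend
      SCCACHE_GCS_RW_MODE: READ_WRITE
      SCCACHE_GCS_KEY_PATH: /tmp/sccache-gcs-key.json   # assumed location
    steps:
      - uses: actions/checkout@v4   # pin to a commit SHA in real pipelines
      - name: Write GCS service account key
        env:
          GCS_KEY_JSON: ${{ secrets.GCS_SCCACHE_KEY }}  # assumed secret name
        run: printf '%s' "$GCS_KEY_JSON" > "$SCCACHE_GCS_KEY_PATH"
      - name: Install sccache
        run: cargo install --locked sccache
      - name: Build through sccache
        env:
          RUSTC_WRAPPER: sccache    # route rustc invocations through sccache
        run: cargo check --workspace
      - name: Show cache hit statistics
        run: sccache --show-stats
```

`RUSTC_WRAPPER` is set only on the build step so that installing sccache itself does not try to go through a wrapper that is not there yet.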

To review:

Swatinem/rust-cache is only efficient if the cache key matches. See https://github.com/centrifuge/centrifuge-chain/actions/caches. The problem is that we only have 10 GB, so PRs keep evicting each other's caches (that's the issue right now). Ideally, the main branch should build a cache that all PRs use.
Sccache is now stored in Gcloud. This should hopefully keep the GHA caches below 10 GB while providing a shared cache for tests and checks (less critical than docker and wasm builds). The Gcloud bucket is in the DEV account, which means developers can potentially also use the cache from their laptops. https://console.cloud.google.com/storage/browser/centrifuge-chain-sccache-backend;tab=objects?forceOnBucketsSortingFiltering=true&project=peak-vista-185616&prefix=&forceOnObjectsSortingFiltering=false
If we're way below the 10 GB, maybe we can also keep the Docker cache in the GHA cache system (faster); otherwise we'll keep it in the registry.
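
One possible way to get the "main builds the cache, PRs reuse it" behaviour mentioned for Swatinem/rust-cache is to share a key across jobs and only save from main. GHA caches created on the default branch can be restored by PRs, while caches created in a PR are scoped to that PR. The step below is a sketch; the key name is hypothetical and this is not necessarily what the PR ships.

```yaml
# Fragment of a job's steps; sketch only.
      - uses: Swatinem/rust-cache@v2
        with:
          # Hypothetical key name, shared by every job that should reuse the cache.
          shared-key: wasm-build
          # Only write the cache on main, so PR runs restore it without
          # filling the 10 GB quota with their own copies.
          save-if: ${{ github.ref == 'refs/heads/main' }}
```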

Benchmarks

Benchmarks now run on the main branch for every push ~~and create a PR with the new weights~~ (creating PRs from a bot was disabled org-wide by an admin after this, whoops!).
The benchmark job will upload the benchmark results to the job's page.
To review:

Benchmark cache - The job takes over 1 h each time. How can we reduce that?
benchmark-test - The longest job in the "PR tests". Do we need it? Can we make it shorter/faster?

Docker builds

Modified and improved the Dockerfile for speed and stability, inspired by the polkadot-sdk Dockerfile.
Standardized the Docker tags using automated docker-metadata.
To review:
Decide on the Docker metadata strategy
If we have to tell a third party to take the "latest released tag", what would it be?
Decide on the docker build cache: GHA backed.
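
To make the two open questions more concrete, here is a sketch of how docker/metadata-action and a GHA-backed buildx cache could fit together. The image name and tag rules are assumptions for discussion, not the strategy decided in this PR. With the action's default `latest=auto` flavor, pushing a semver git tag also produces a `latest` tag, which is one candidate for the "latest released tag".

```yaml
# Fragment of a job's steps; sketch only. Pin actions to commit SHAs in real pipelines.
      - uses: docker/setup-buildx-action@v3
      - id: meta
        uses: docker/metadata-action@v5
        with:
          images: example-org/example-image   # hypothetical image name
          tags: |
            type=semver,pattern={{version}}
            type=ref,event=branch
            type=sha
      - uses: docker/build-push-action@v5
        with:
          context: .
          push: false                         # registry login omitted in this sketch
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          # GHA-backed layer cache. The registry-backed alternative would be
          # type=registry,ref=<image>:<cache-tag> on both lines.
          cache-from: type=gha
          cache-to: type=gha,mode=max
```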

New features/enhancements

~~Code coverage: https://doc.rust-lang.org/rustc/instrument-coverage.html~~
~~Build time graphs: https://doc.rust-lang.org/nightly/cargo/reference/timings.html~~
~~(Optional) To be super rust-native, can we use https://github.com/matklad/cargo-xtask to define what needs "to be done" instead of using a bash script?~~

Checklist to set this PR as "ready"

+Add myself to codeowners

wischli

AFAICT, there are some removable artifacts left.

.github/workflows/build-wasm.yml

.github/workflows/run-benchmarks.yml

.github/workflows/sanity-checks.yml

wischli

Thanks so much for taking the time to pimp up our CI. Looking awesome and blazingly fast 🚀

lemunozm · 2023-11-17T11:58:24Z

.github/workflows/sanity-checks.yml

    strategy:
      matrix:
-        runtime: [development, altair, centrifuge]
+        runtime: [altair, centrifuge]


Why not development?

I thought we didn't need to check dev benchmarks, to save time. I know this kind of contradicts my argument for keeping benchmarks in Dev. However, in past releases I merged Centrifuge benchmarks into Dev because there was no runtime benchmark pipeline. I think that's feasible because this way we mimic mainnet.

How about we iterate over that when we build the next feature, which will probably first only exist in Dev runtime?

Since these are PR checks, let's keep it minimal. This won't actually publish or set benchmarks anywhere; it's just to check that the benchmarks run.
Do we really need all 3 to run?

lemunozm · 2023-11-17T12:01:46Z

.github/CODEOWNERS

+.github/workflows @gpmayorga
+.github/actions @gpmayorga


we could add @wischli too, for the bus factor

lemunozm · 2023-11-17T12:02:35Z

.github/workflows/docs.yml

    steps:
      - name: Checkout repository
-        uses: actions/checkout@v3
+        uses: actions/checkout@3df4ab11eba7bda6032a0b82a6bb43b11571feac #v4


Out of curiosity: why point to a hash directly?

In the unlikely case that the actions/checkout repository introduces a bad update, or someone hacks it and introduces malicious code, our pipeline will run the update without question if we reference a tag like v1 instead of a commit hash. Pinning the hash is a way of freezing the third-party code we run.
Keep in mind some of the CI pipelines have access to our Google Cloud credentials so a malicious change can potentially obtain access too.
More info: https://docs.github.com/en/actions/security-guides/security-hardening-for-github-actions#using-third-party-actions

lemunozm · 2023-11-17T12:03:44Z

.github/workflows/run-benchmarks.yml

+    runs-on: ubuntu-latest-8-cores
+    strategy:
+      matrix:
+        runtimes: [centrifuge, altair]


I think that now that Demo uses development, we should also add the weights here. See this Slack thread.

I would also add development here. But let's have this in a second PR, where we should also add the possibility to build development wasm for upgrades in dev and demo.

lemunozm · 2023-11-17T12:04:21Z

.github/workflows/sanity-checks.yml

+    strategy:
+      matrix:
+        target: [test-general, test-integration,
+                 lint-fmt, lint-clippy, cargo-build] # ,lint-taplo]


Should lint-taplo be re-enabled?

Our intention was to re-enable it in a follow-up PR in case it doesn't work out of the box. WDYT?

Every time I enabled it, it failed consistently. I think it requires a bit of work, so I agree on dealing with it in separate PRs.

lemunozm · 2023-11-17T12:04:45Z

.github/workflows/xperimental-codecov.yml.commented

+            echo "---- GENERATE CODE COVERAGE ----"
+            echo "# Install Tarpaulin"
+            cargo install --locked cargo-tarpaulin
+            # make Cargo.toml


Can this line be removed?

This file is not used right now but is kept in a WIP state for introducing code coverage.

Yes, this will be a separate PR

wischli

Re-approving. Unfortunately, my approval is not sufficient.

gpmayorga · 2023-11-17T15:14:44Z

All tasks for the future:
https://centrifuge.hackmd.io/Ftgot4ilSxK5ERyrEujnBg?both#ToDo-after-merging-1551

mustermeiszer

I will approve based on the knowledge I have, but more on the fact that @wischli and @gpmayorga spent so much time on this. Thanks a lot for the improvements!!!

mustermeiszer · 2023-11-17T15:13:45Z

.github/workflows/run-benchmarks.yml

+    runs-on: ubuntu-latest-8-cores
+    strategy:
+      matrix:
+        runtimes: [centrifuge, altair]


I would also add development here. But let's have this in a second PR, where we should also add the possibility to build development wasm for upgrades in dev and demo.

gpmayorga added 7 commits September 14, 2023 23:31

Renmove unnecessary/old workflow files

943ca6a

Add PR common checks and wasm build(s)

9ea2c6f

Modify docs buld

298c307

modify benchmark

06dc4ef

Modify CI script (simpler)

77948d5

Add docker build

949dbeb

Add prep action for common Ubuntu steps.

934145d

gpmayorga requested review from mustermeiszer and NunoAlexandre as code owners September 15, 2023 03:46

gpmayorga mentioned this pull request Sep 15, 2023

Start rebuilding the GH actions for speed #1542

Closed

4 tasks

separate sccache gcloud action

3641a11

gpmayorga force-pushed the ci-rewrite-n-cache branch 4 times, most recently from dc4ecc0 to 54bc71f Compare September 15, 2023 04:19

fix dockertag pattern

98ca8f3

gpmayorga force-pushed the ci-rewrite-n-cache branch from 54bc71f to 98ca8f3 Compare September 15, 2023 04:28

gpmayorga and others added 5 commits September 15, 2023 00:35

delete old benchmark check

c2bc2cf

Trick the wasm publish for this branch

79eff04

new benchmark pipeline

bb5713e

additional cache options

c6be052

exclude runtime integration tests from dockerfile

309afdb

gpmayorga had a problem deploying to production September 15, 2023 05:46 — with GitHub Actions Failure

gpmayorga had a problem deploying to production September 15, 2023 05:46 — with GitHub Actions Error

gpmayorga had a problem deploying to production September 15, 2023 06:18 — with GitHub Actions Error

gpmayorga temporarily deployed to production November 16, 2023 17:51 — with GitHub Actions Inactive

wischli reviewed Nov 16, 2023

review of sanity-checks

13e2085

gpmayorga temporarily deployed to production November 17, 2023 10:23 — with GitHub Actions Inactive

Final review of CI PR with @wischli

3e9b865

gpmayorga temporarily deployed to production November 17, 2023 11:22 — with GitHub Actions Inactive

wischli previously approved these changes Nov 17, 2023

gpmayorga enabled auto-merge (squash) November 17, 2023 11:58

lemunozm reviewed Nov 17, 2023

Use Rust docker image for building the binary

bfc67a5

gpmayorga dismissed wischli’s stale review via bfc67a5 November 17, 2023 12:32

gpmayorga temporarily deployed to production November 17, 2023 12:33 — with GitHub Actions Inactive

wischli previously approved these changes Nov 17, 2023

Fix review comments

2fac7af

gpmayorga dismissed wischli’s stale review via 2fac7af November 17, 2023 15:15

gpmayorga temporarily deployed to production November 17, 2023 15:15 — with GitHub Actions Inactive

mustermeiszer approved these changes Nov 17, 2023

gpmayorga merged commit 8d78e6f into main Nov 17, 2023
16 checks passed

wischli mentioned this pull request Nov 23, 2023

Failed to run testnet node using docker image #1620

Closed
